Style-Rank is a unified benchmarking framework for generative styling models in PyTorch. This repository contains code wrapping the implementations of several papers in the field of generative styling models, along with implementations of metrics to evaluate the quality of the generated images. We also provide Style-Rank, an evaluation dataset for comparing the models.
This work was developed by Eyal Benaroche, Clément Chadebec, Onur Tasar, and Benjamin Aubin from Jasper Research and Ecole Polytechnique.
Model | Arxiv | Code | Project Page | Notes |
---|---|---|---|---|
StyleAligned | Arxiv | Code | Project Page | |
VisualStyle | Arxiv | Code | Project Page | |
IP-Adapter | Arxiv | Code | Project Page | Using the implementation from Diffusers |
InstantStyle | Arxiv | Code | Project Page | Using the implementation from Diffusers |
CSGO | Arxiv | Code | Project Page | |
Style-Shot | Arxiv | Code | Project Page | |
We implemented several common metrics to evaluate the quality of the generated images:
- CLIP-Text metric: Cosine Similarity between a caption (embedded using `ClipTextModel`) and the generated image (embedded using `ClipVisionModel`) - Using the implementation from Transformers (a minimal sketch follows this list)
- CLIP-Image metric: Cosine Similarity between two images (embedded using `ClipVisionModel`) - Using the implementation from Transformers
- Dino: Cosine Similarity between two images (embedded using `Dinov2Model`) - Using the implementation from Transformers
- ImageReward: Score from the ImageReward model
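For intuition, here is a minimal sketch of the CLIP-Text idea using the CLIP classes from Transformers directly; it is not the stylerank API, and the checkpoint name, caption, and image path are placeholders.

```python
# Minimal sketch of the CLIP-Text metric: cosine similarity between the CLIP
# embedding of a caption and the CLIP embedding of the generated image.
# Not the stylerank implementation; checkpoint and file names are placeholders.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("generated.png")
caption = "a cat in watercolor style"

inputs = processor(text=[caption], images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# text_embeds and image_embeds are already projected into the shared CLIP space
score = torch.nn.functional.cosine_similarity(
    outputs.text_embeds, outputs.image_embeds
).item()
print(f"CLIP-Text similarity: {score:.4f}")
```

The CLIP-Image and Dino metrics follow the same pattern, but compare two image embeddings instead of a text/image pair.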
The dataset is an aggregation of images from multiple styling papers.
To get up and running, first create a virtual environment with at least Python 3.10 and activate it:

```bash
python3.10 -m venv envs/style_rank
source envs/style_rank/bin/activate
```

Or, with conda:

```bash
conda create -n style_rank python=3.10
conda activate style_rank
```
Then install the required dependencies (if on GPU) and the repository in editable mode:

```bash
pip install --upgrade pip
pip install -r requirements.txt
pip install -e .
```
Using the provided code, you can generate stylized images on the provided datasets (or your own, given the right format) and evaluate them using the provided metrics. Results can fluctuate because generation is not seeded and the default prompts are sampled from a list.
The dataset is formatted to be used with WebDataset.

You can download it locally:

```bash
wget -O data/stylerank_papers.tar "https://huggingface.co/datasets/jasperai/style-rank/resolve/main/stylerank_papers.tar"
```
Or you can stream it from HuggingFace with webdataset:

```python
import webdataset as wds

url = "pipe:curl -s -L https://huggingface.co/datasets/jasperai/style-rank/resolve/main/stylerank_papers.tar"
dataset = wds.WebDataset(url).decode("pil")
sample = next(iter(dataset))
sample["jpg"].show()
```
The dataset contains `license`, `source`, `url`, `caption_blip`, `caption_cogvlm`, `style_caption` and `style_captionner` metadata, located as follows:
```python
sample = {
    '__key__': image_key,
    'jpg': image_data,
    'json': {
        'license': image_license,
        'source': image_source,
        'url': original_dataset_url,
        'caption_blip': blip2_caption,
        'caption_cogvlm': cogvlm_caption,
        'style_caption': style_caption,
        'style_captionner': style_captioner,
    }
}
```
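As a sketch (assuming the fields shown above), you can read this metadata directly from the streamed samples:

```python
# Sketch: iterate the streamed dataset and read the JSON metadata of each image.
# Field names follow the structure shown above.
import webdataset as wds

url = "pipe:curl -s -L https://huggingface.co/datasets/jasperai/style-rank/resolve/main/stylerank_papers.tar"
dataset = wds.WebDataset(url).decode("pil")

for sample in dataset:
    meta = sample["json"]  # decoded to a dict by .decode()
    print(sample["__key__"], meta["license"], meta["style_caption"])
    break  # only inspect the first sample
```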
To generate images using one of the provided models, you can use the scripts provided in the `examples/inference` folder. For example, to generate images using the StyleAligned model, you can use the following command:

```bash
python examples/inference/stylealigned.py [--input-path /path/to/dataset] [--output-path /path/to/output]
```
The default output path is `output/inference/` and the default input path is `data/stylerank_papers.tar`.
Additionally, you can provide the `--json_path` argument to use a different JSON file for the prompts, or use the `--prompts` argument to provide a list of prompts to use for the generation.
The script iterates through the provided `.tar` file and generates 4 random images based on the prompts provided in the `prompts.json` file, following an evaluation process similar to the one described in the VisualStyle paper.
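As a rough illustration of that sampling step (the exact layout of `prompts.json` is an assumption here, not taken from the repository):

```python
# Rough illustration only: pick 4 random prompts per reference image.
# The actual structure of data/prompts.json may differ.
import json
import random

with open("data/prompts.json") as f:
    prompts = json.load(f)  # assumed: a flat list of prompt strings

selected = random.sample(prompts, k=4)
print(selected)
```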
The folder structure should be as follows:

```
.
├── README.md
├── data
│   ├── stylerank_papers.tar
│   └── prompts.json
├── examples
│   ├── inference
│   └── report
├── output
│   ├── inference
│   └── metrics
├── requirements.txt
├── setup.py
├── src
│   └── stylerank
└── tests
    ├── reference_images
    ├── test_metrics
    └── test_model
```
When running an inference script, the model will by default create a folder named after itself to store the generated samples, with one sub-folder per reference image (named after its key) containing the reference image and the images generated from each prompt. Inside the `./output/` folder, the structure should look like this:
```
.
├── inference
│   ├── instant_style
│   │   ├── 0000
│   │   │   ├── prompt_1.png
│   │   │   ├── prompt_2.png
│   │   │   ├── prompt_3.png
│   │   │   ├── prompt_4.png
│   │   │   └── reference.png
│   │   ├── 0001
│   │   ├── ...
│   │   └── 0111
│   ├── ip_adapter
│   │   ├── 0000
│   │   ├── 0001
│   │   ├── ...
│   │   └── 0111
│   ├── stylealigned
│   │   └── ...
│   └── visualstyle
│       └── ...
└── metrics
    ├── interrupted.csv
    ├── report.csv
    └── metrics.csv
```
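For instance, here is a small sketch that walks this layout to pair each reference image with its generated samples (folder and file names are taken from the tree above; this helper is not part of the repository):

```python
# Sketch: walk output/inference/<model>/<key>/ and pair each reference image
# with the prompt_*.png samples generated for it. Paths follow the tree above.
from pathlib import Path

model_dir = Path("output/inference/stylealigned")
for ref_dir in sorted(p for p in model_dir.iterdir() if p.is_dir()):
    reference = ref_dir / "reference.png"
    generated = sorted(ref_dir.glob("prompt_*.png"))
    print(ref_dir.name, reference.exists(), [p.name for p in generated])
```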
Given the generated images, you can evaluate the results using the provided metrics. For example, to evaluate the generated images using the CLIP-Text metric, you can use the following command:

```bash
python examples/report/metrics.py --metrics ClipText [--input-path /path/to/dataset] [--output-path /path/to/output]
```
You can run multiple metrics at once by providing a list of metrics to the `--metrics` argument, e.g.:

```bash
python examples/report/metrics.py --metrics "[ClipText, ClipImage, Dinov2, ImageReward]" [--input-path /path/to/dataset] [--output-path /path/to/output]
```
It will output the results in the `/path/to/output/metrics.csv` file and the mean for each metric in the `/path/to/output/report.csv` file.

If you cancel the process, it will automatically save the results in the `/path/to/output/interrupted.csv` file.
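As a quick sketch, you can inspect these CSVs with pandas; the per-metric means should roughly match `report.csv` (column names depend on the metrics you ran):

```python
# Sketch: load the per-image scores and recompute per-metric means.
# Exact column names depend on the metrics that were run.
import pandas as pd

scores = pd.read_csv("output/metrics/metrics.csv")
print(scores.mean(numeric_only=True))  # should match report.csv
```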
Running the evaluation on the provided `stylerank_papers.tar` dataset, we get the following results:
Model | ImageReward ↑ | Clip-Text ↑ | Clip-Image ↑ | Dinov2 ↑ |
---|---|---|---|---|
StyleAligned | -1.26 | 19.26 | 68.72 | 36.29 |
VisualStyle | -0.72 | 22.12 | 66.68 | 20.80 |
IP-Adapter | -2.03 | 15.01 | 83.66 | 40.50 |
Style-Shot | -0.38 | 21.34 | 65.04 | 23.04 |
CSGO | -0.29 | 22.16 | 61.73 | 16.85 |
InstantStyle | -0.13 | 22.78 | 66.43 | 18.48 |
Inversion-InstantStyle | -1.30 | 18.90 | 76.60 | 49.42 |
To run the tests and make sure the models and metrics are working as expected, install pytest and run:

```bash
pip install pytest
pytest tests/
```
This code is released under the Creative Commons BY-NC 4.0 license.
If you find this work useful or use it in your research, please consider citing us:
```bibtex
@misc{benaroche2024stylerank,
  title={Style-Rank: Benchmarking stylization for diffusion models},
  author={Eyal Benaroche and Clement Chadebec and Onur Tasar and Benjamin Aubin},
  year={2024},
}
```