
cog-flux

This is a Cog inference model for FLUX.1 [schnell] and FLUX.1 [dev] by Black Forest Labs. It powers the FLUX.1 [schnell] and FLUX.1 [dev] models on Replicate.

Features

  • Compilation with torch.compile
  • Optional fp8 quantization based on aredden/flux-fp8-api, using fast cuDNN attention from PyTorch nightlies
  • NSFW checking with CompVis and Falcons.ai safety checkers
  • img2img support

Getting started

If you just want to use the models, you can run FLUX.1 [schnell] and FLUX.1 [dev] on Replicate with an API or in the browser.

The code in this repo can be used as a template for customizations on FLUX.1, or to run the models on your own hardware.

First you need to select which model to run:

script/select.sh {dev,schnell}

Then you can run a single prediction on the model using:

cog predict -i prompt="a cat in a hat"

The Cog getting started guide explains what Cog is and how it works.

To deploy it to Replicate, run:

cog login
cog push r8.im/<your-username>/<your-model-name>

Learn more on the deploy a custom model guide in the Replicate documentation.

Contributing

Pull requests and issues are welcome! If you see a novel technique or feature you think will make FLUX.1 inference better or faster, let us know and we'll do our best to integrate it.

Rough, partial roadmap

  • Serialize quantized model instead of quantizing on the fly
  • Use row-wise quantization
  • Port quantization and compilation code over to https://github.com/replicate/flux-fine-tuner
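To make the row-wise quantization roadmap item concrete, here is a minimal sketch of the general technique (an illustration only, not the repo's implementation): each weight row gets its own scale factor, so a row of small values is not crushed by a single large outlier elsewhere in the tensor.

```python
def quantize_rowwise(weights, n_levels=127):
    """Quantize each row to signed integers with a per-row scale.

    weights: list of rows of floats. Returns (int rows, per-row scales).
    """
    q_rows, scales = [], []
    for row in weights:
        # Per-row scale maps the row's max magnitude onto n_levels.
        scale = max(abs(v) for v in row) / n_levels or 1.0
        q_rows.append([round(v / scale) for v in row])
        scales.append(scale)
    return q_rows, scales


def dequantize_rowwise(q_rows, scales):
    """Recover approximate floats by multiplying each row by its scale."""
    return [[q * s for q in row] for row, s in zip(q_rows, scales)]


# Rows with very different magnitudes each keep their own precision.
w = [[0.02, -0.01, 0.03], [5.0, -4.0, 2.5]]
qw, s = quantize_rowwise(w)
w_hat = dequantize_rowwise(qw, s)
```

The maximum reconstruction error per element is half the row's scale, which for the small-magnitude row above is far tighter than a single per-tensor scale would allow.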

License

The code in this repository is licensed under the Apache-2.0 License.

FLUX.1 [dev] falls under the FLUX.1 [dev] Non-Commercial License.

FLUX.1 [schnell] falls under the Apache-2.0 License.