multimodal model trained on comic books

https://twitter.com/DigThatData/status/1550184909817274371

use "by <artist name>" phrases to construct additional training prompts
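
A minimal sketch of the prompt idea above, assuming each training image already has a caption and a list of credited artists. The function name `build_prompts` and the exact "caption by artist" template are illustrative assumptions, not something the linked thread specifies.

```python
from typing import List


def build_prompts(caption: str, artists: List[str]) -> List[str]:
    """Return the base caption plus one "by <artist>" variant per artist.

    Assumption: the attribution is simply appended to the caption.
    """
    caption = caption.strip()
    prompts = [caption]
    for artist in artists:
        name = artist.strip()
        if name:  # skip empty credits rather than emitting "caption by "
            prompts.append(f"{caption} by {name}")
    return prompts
```

Keeping the unattributed caption in the output means the extra prompts augment the data rather than replace the original caption.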

segment pages into cells (individual panels)
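
One possible way to segment a page, sketched under strong assumptions: the page is a grayscale grid of pixel values (0 to 255), cells are separated by near-white gutters, and the layout is a stack of horizontal bands that are each split by vertical gutters. The names `segment_cells` and `white` are hypothetical, and the pure-Python lists stand in for whatever image library would be used in practice.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right); bottom/right exclusive


def _runs(flags: List[bool]) -> List[Tuple[int, int]]:
    """Return [start, end) index ranges where flags is True."""
    runs, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(flags)))
    return runs


def segment_cells(page: List[List[int]], white: int = 250) -> List[Box]:
    """Split a grayscale page into cell boxes using blank gutters.

    Assumption: any pixel darker than `white` counts as ink.
    """
    # Rows containing ink form horizontal bands between blank gutters.
    row_has_ink = [any(p < white for p in row) for row in page]
    boxes: List[Box] = []
    for top, bottom in _runs(row_has_ink):
        band = page[top:bottom]
        width = len(band[0])
        # Within a band, columns containing ink form the individual cells.
        col_has_ink = [any(row[x] < white for row in band) for x in range(width)]
        for left, right in _runs(col_has_ink):
            boxes.append((top, left, bottom, right))
    return boxes
```

This only handles layouts with clean, straight gutters. Pages where artwork crosses a gutter will merge into a single oversized box, which is one signal that the page needs different handling.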

keep a separate dataset for cleanly segmented comics, apart from comics whose images extend past cell borders or otherwise use space creatively in ways that will complicate dataset construction and parsing
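
A rough sketch of how pages might be routed into the two datasets, assuming cell bounding boxes are already available for each page. The heuristic used here (a page is "irregular" if any two of its boxes overlap) is an assumption chosen for illustration, and `split_by_layout` is a hypothetical name; other signals, such as ink inside the gutters, could serve equally well.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right); bottom/right exclusive


def boxes_overlap(a: Box, b: Box) -> bool:
    """True if the two boxes share at least one pixel."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def split_by_layout(pages: Dict[str, List[Box]]) -> Tuple[List[str], List[str]]:
    """Return (clean_ids, irregular_ids).

    Assumption: overlapping cell boxes indicate artwork that extends
    past cell borders.
    """
    clean: List[str] = []
    irregular: List[str] = []
    for page_id, boxes in pages.items():
        messy = any(boxes_overlap(a, b) for a, b in combinations(boxes, 2))
        (irregular if messy else clean).append(page_id)
    return clean, irregular
```

Starting with the clean set keeps dataset construction simple, while the irregular set can be revisited once segmentation handles unconventional layouts.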