hyper-embedding: fit a model that uses an implicit representation for its token embeddings, i.e. the embeddings are generated by a network rather than stored in a lookup table. for finetuning: finetune only the CINN (or maybe just condition the outputs).
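
a minimal sketch of the idea, under stated assumptions: the note does not say what the CINN is or how tokens are encoded, so here a small MLP stands in for it and a fixed Fourier encoding of the token id is used as its input. `ImplicitEmbedding`, `token_codes`, and the `scale`/`shift` arguments are hypothetical names for illustration. `scale`/`shift` show one possible reading of "condition the outputs", not a confirmed design.

```python
import numpy as np


def token_codes(vocab_size, n_freqs=8):
    """Fixed (non-trainable) Fourier features of the token id.

    Returns an array of shape (vocab_size, 2 * n_freqs). The encoding is an
    assumption; the note does not specify how tokens are fed to the network.
    """
    ids = np.arange(vocab_size)[:, None] / vocab_size       # ids scaled to [0, 1)
    freqs = (2.0 ** np.arange(n_freqs))[None, :] * np.pi    # geometric frequency ladder
    angles = ids * freqs
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=1)


class ImplicitEmbedding:
    """Hypothetical replacement for an embedding table: vectors are computed, not stored."""

    def __init__(self, vocab_size, dim, hidden=32, n_freqs=8, seed=0):
        rng = np.random.default_rng(seed)
        self.codes = token_codes(vocab_size, n_freqs)        # fixed inputs, never trained
        in_dim = self.codes.shape[1]
        # The only trainable state: a two-layer MLP standing in for the CINN.
        self.params = {
            "W1": rng.normal(0.0, in_dim ** -0.5, (in_dim, hidden)),
            "b1": np.zeros(hidden),
            "W2": rng.normal(0.0, hidden ** -0.5, (hidden, dim)),
            "b2": np.zeros(dim),
        }

    def __call__(self, token_ids, scale=None, shift=None):
        x = self.codes[np.asarray(token_ids)]
        h = np.tanh(x @ self.params["W1"] + self.params["b1"])
        out = h @ self.params["W2"] + self.params["b2"]
        # Optional output conditioning (assumed reading of "condition the outputs"):
        # a per-dimension scale and shift applied to the generated embeddings.
        if scale is not None:
            out = out * scale
        if shift is not None:
            out = out + shift
        return out

    def n_params(self):
        return sum(p.size for p in self.params.values())


emb = ImplicitEmbedding(vocab_size=50_000, dim=64)
vectors = emb([3, 17, 49_999])                               # one 64-dim vector per token id
conditioned = emb([3], scale=2.0 * np.ones(64), shift=np.ones(64))
```

in this sketch the generator has a few thousand parameters, against 3.2 million entries for an explicit 50,000 x 64 table, so finetuning only `emb.params` (or only `scale`/`shift`) touches a very small part of the model. whether embeddings generated this way are expressive enough is exactly the open question of the note.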