Thursday, January 9, 2025

Flux Handwriting Model: AI Mimicking Human Handwriting


I never pictured AI writing human-like text. No, I’m not talking about text generation, but rather image generation with human handwriting. The Flux models have made it easy to infer, generate, and edit images. In this article, we’ll look at one such model used for generating images with handwritten text. That’s not all: we’ll also build a storytelling application towards the end of the article.

What are FLUX Models?

Flux models are generative models that are often associated with producing high-quality images, videos, or other content. These models are built using advanced neural networks like Stable Diffusion or Variational Autoencoders (VAEs). We’ll be focusing on one Flux model, specifically the fofr/flux-handwriting model, throughout the article.

Flux Handwriting Model

“fofr/flux-handwriting” is a Flux LoRA fine-tuned to produce handwritten text. Let’s look at various ways to use it to generate images with handwritten text.

Hugging Face

You can quickly go to the model page on Hugging Face and use the Diffusers library or the Inference API to generate images.

Note: Remember that you need to include HWRIT handwriting in the prompt to trigger the image generation.
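As a rough sketch of the Diffusers route, the snippet below adds the trigger phrase to a prompt when it is missing and then runs the LoRA on top of the FLUX.1-dev base model. The helper names with_trigger and generate_locally are hypothetical, and the sketch assumes a CUDA GPU, access to the black-forest-labs/FLUX.1-dev weights, and that the LoRA file is detected automatically (a weight_name argument may be needed otherwise).

```python
TRIGGER = "HWRIT handwriting"


def with_trigger(prompt: str) -> str:
    """Prepend the trigger phrase unless the prompt already contains HWRIT."""
    if "HWRIT" in prompt:
        return prompt
    return f"{TRIGGER}, {prompt}"


def generate_locally(prompt: str):
    """Hypothetical helper: run the LoRA with Diffusers (assumes a CUDA GPU)."""
    # Imported lazily so the prompt helper works without torch/diffusers installed.
    import torch
    from diffusers import FluxPipeline

    pipe = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
    )
    # Assumption: the LoRA weights file in the repo is found automatically.
    pipe.load_lora_weights("fofr/flux-handwriting")
    pipe.to("cuda")
    return pipe(with_trigger(prompt)).images[0]


print(with_trigger("neat cursive note on lined paper"))
```

Calling generate_locally("a shopping list in blue ink") would return a PIL image that can be stored with image.save("note.png").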

I prompted it to generate: “HWRIT shaky messy handwriting stating "The sun will rise," illegible, dark green ink on old water-damaged paper with visible mould marks.”

The generated image has the text in the same style I had described in the prompt.

Let’s try the Inference API

Get your Hugging Face access token from here: Hugging Face Tokens

from huggingface_hub import InferenceClient
# Replace "hf_token" with your own Hugging Face access token
client = InferenceClient("fofr/flux-handwriting", token="hf_token")
# The output is a PIL.Image object
image = client.text_to_image('HWRIT scrawling messy handwriting saying "I am Iron Man", illegible, written with a HB pencil on a grainy paper')

Output

This looks pretty good, and the model didn’t mess up any characters either.

Note: Image generation takes a while.

Replicate

You can also choose to run this model on Replicate, but it will cost you roughly $1 for 90 runs, or roughly $0.011 per run. This can vary.
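To make the arithmetic concrete, here is a small sketch that turns the 90-runs-per-dollar figure into a per-image estimate. The function name estimated_cost is hypothetical, and the figure itself is only the one quoted above, so treat the result as a rough guide.

```python
RUNS_PER_DOLLAR = 90  # figure quoted above; Replicate pricing can change


def estimated_cost(num_images: int, runs_per_dollar: int = RUNS_PER_DOLLAR) -> float:
    """Approximate cost in USD of generating num_images images."""
    return num_images / runs_per_dollar


print(f"per run:  ${estimated_cost(1):.3f}")
print(f"7 images: ${estimated_cost(7):.3f}")
```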

Storytelling Application

Let’s create an LLM application that first writes a story, then breaks it into 7 pieces, and then generates 7 handwritten images to support the storytelling. We’ll then combine these images to complete the application.

We’ll be using Gemini models to generate the story and to build the prompts for generating images with flux-handwriting. First, let’s get our hands on the Gemini API key:

Simply click on Create API Key to get a new key for using the Gemini models.

Installation

!pip install -q -U google-generativeai

This installs the package needed to use the Gemini models.

Implementation

Configure the API key and choose the model. I’ll be using the ‘gemini-1.5-flash’ model.

import google.generativeai as genai
# Replace "API-Key" with your own Gemini API key
genai.configure(api_key="API-Key")
model = genai.GenerativeModel("gemini-1.5-flash")

Generating the story:

response = model.generate_content("Write a short and clear story in about 80 words, about a day in the life of a man named Cyan turning into a superhero.")
story = response.text
print(story)
Cyan woke to a throbbing headache, an odd image burning into his palm.
That day, mundane tasks – grocery shopping, dog walking – felt amplified,
his senses sharper. A speeding car careened towards a child; instinctively,
Cyan reacted.  He moved faster than he thought possible, a blur of motion,
saving the child.  The image glowed.  He was not just Cyan. He was
something more.

Now split the story into 7 parts:

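# Split on sentence boundaries, dropping empty pieces and any trailing period
sentences = [s.strip().rstrip(".") for s in story.replace("\n", " ").split(". ") if s.strip()]
# Keep at most 7 sentences and restore the final period on each
prompts = [f"{s}." for s in sentences[:7]]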
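Splitting on ". " misses sentences that end in "!" or "?". As a possible alternative, the sketch below uses a regular expression that keeps the end punctuation and collapses line breaks first. The helper name split_story is hypothetical, and the sample text is made up for illustration.

```python
import re


def split_story(story: str, max_parts: int = 7) -> list[str]:
    """Split a story into at most max_parts sentences, keeping end punctuation."""
    text = " ".join(story.split())  # collapse newlines and repeated spaces
    sentences = re.split(r"(?<=[.!?])\s+", text)
    return [s for s in sentences if s][:max_parts]


sample = "Cyan woke up. He felt odd!  Was it real?\nHe ran."
print(split_story(sample))
```

The result can be used in place of the prompts list, for example prompts = split_story(story).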

Turn the 7 parts into prompts for the handwriting model:

handwriting_prompts = [
   f"HWRIT handwriting style for the text: '{prompt}' in a neat cursive writing in orange Ink and red paper background"
   for prompt in prompts if prompt.strip()
]

Function to generate the handwritten images:

(Get your Hugging Face token and make sure to check all the inference boxes while creating the token)

from huggingface_hub import InferenceClient
import time
# Replace "hf_token" with your own Hugging Face access token
client = InferenceClient("fofr/flux-handwriting", token="hf_token")
def handwriting_text(prompt):
    # Returns a PIL.Image generated from the prompt
    image = client.text_to_image(prompt)
    return image

Generating images with handwritten text:

handwritten_images = []
for prompt in handwriting_prompts:
    image = handwriting_text(prompt)
    handwritten_images.append(image)
    time.sleep(120)  # 2-minute delay

Note: The API request might throw an error along the lines of “Max requests total reached on image generation inference (3). Wait up to one minute before being able to process more Diffusion requests.” Hence, we’re adding a 120-second sleep after each request in the for loop.
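A fixed sleep slows down every request, even when no limit was hit. One possible alternative is to wait only after a failed call and then retry. The sketch below is a generic wrapper, not part of huggingface_hub; the name call_with_retry and its parameters are hypothetical, and it catches every exception because the exact error type raised by the client is not assumed here.

```python
import time


def call_with_retry(fn, *args, retries=3, wait_seconds=60, sleep=time.sleep):
    """Call fn(*args); on failure wait and try again, up to `retries` attempts."""
    for attempt in range(1, retries + 1):
        try:
            return fn(*args)
        except Exception as err:  # exact exception type depends on the client
            if attempt == retries:
                raise  # out of attempts: surface the last error
            print(f"Attempt {attempt} failed ({err}); waiting {wait_seconds}s")
            sleep(wait_seconds)
```

Inside the loop, the call would then read image = call_with_retry(handwriting_text, prompt).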

Generating the video using OpenCV:

import cv2
def create_video_from_images(image_list, output_video_path, fps=1):
    # Load the first image to get dimensions
    frame = cv2.imread(image_list[0])
    height, width, _ = frame.shape
    # Initialize the video writer
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')
    video = cv2.VideoWriter(output_video_path, fourcc, fps, (width, height))
    # Write each image to the video
    for image_path in image_list:
        frame = cv2.imread(image_path)
        video.write(frame)
    # Release the video writer
    video.release()
# Save the images to disk and create a video
image_file_paths = []
for idx, image in enumerate(handwritten_images):
    file_path = f"handwritten_image_{idx}.png"
    image.save(file_path)
    image_file_paths.append(file_path)
# Combine images into a video
create_video_from_images(image_file_paths, "handwritten_story.mp4", fps=0.25)
print("Video created: handwritten_story.mp4")

We saved the images and then combined them into a video with a frame rate of 0.25 (1 frame every 4 seconds, for readability).
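If a video player has trouble with a fractional frame rate, one possible workaround is to keep a whole-number frame rate and write each image several times instead. The sketch below only builds the repeated path list; the helper name repeat_frames is hypothetical.

```python
def repeat_frames(image_paths, seconds_per_image=4, fps=1):
    """Return the path list with each image repeated to fill its screen time."""
    copies = max(1, round(seconds_per_image * fps))
    return [path for path in image_paths for _ in range(copies)]


frames = repeat_frames(["a.png", "b.png"], seconds_per_image=4, fps=1)
print(len(frames))  # 8 entries: each image appears 4 times
```

The expanded list can be passed to the same function, for example create_video_from_images(repeat_frames(image_file_paths), "handwritten_story.mp4", fps=1).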

Output

Hyperlink to the video: handwritten_story.mp4

Note: The model struggles when generating images with more than 4-5 words per image, so we need to restrict the text.

One exercise you could try is to use an LLM to write the prompts instead of splitting the story and using a common template. This can enforce the text limit, and the LLM can tune the style and background of the text to match its content.
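Short of involving an LLM, the 4-5 word limit can also be enforced mechanically. The sketch below is a simple non-LLM alternative that breaks a sentence into chunks of at most five words; the helper name chunk_words is hypothetical, and it pays no attention to where a phrase naturally ends.

```python
def chunk_words(text: str, max_words: int = 5) -> list[str]:
    """Break text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


print(chunk_words("He moved faster than he thought possible, a blur of motion"))
```

Each chunk can then be dropped into the handwriting prompt template, one image per chunk.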

Conclusion

In conclusion, using Flux models such as “fofr/flux-handwriting” introduces new opportunities for crafting custom handwritten-style visuals. Whether creating standalone prompts or developing full storytelling solutions, these tools highlight AI’s ability to merge artistic creativity with practical applications. The storytelling feature exemplifies how easily AI-generated visuals can blend into multimedia projects, opening up creative and engaging possibilities.

Frequently Asked Questions

Q1. What is the “flux-handwriting” model?

Ans. The “flux-handwriting” model is a LoRA (Low-Rank Adaptation) fine-tuned version of the FLUX.1-dev model, designed to generate images of handwriting in various styles based on text prompts.

Q2. How do I use the “flux-handwriting” model with the Diffusers library?

Ans. First, load the base FLUX.1-dev model using the Diffusers library. Then, apply the “flux-handwriting” LoRA weights to the pipeline. Finally, generate images by providing prompts.

Q3. What are the trigger words for this model?

Ans. To activate the handwriting generation feature, include the trigger phrase HWRIT handwriting in your prompt.

Q4. How else can I access this model other than through Hugging Face?

Ans. You can use Replicate to run inference with the fofr/flux-handwriting model: Replicate.

I'm a tech enthusiast and a graduate of Vellore Institute of Technology. I'm working as a Data Science Trainee right now. I’m very much interested in Deep Learning and Generative AI.
