Basics of AI, LLMs, GPTs and Generative AI for Freelancers
Several friends and colleagues have privately asked me if the MIT “Applied Generative AI for Digital Transformation” course was worth taking (it is quite expensive and intense) after I shared my certificate of successful completion. My answer is that it was definitely worth it for me in my quest to find the most impactful use of Generative AI for freelancers beyond image and video generation, though perhaps not as much as for many of my classmates who hold CIO or other digital transformation roles in companies across various industries worldwide.
An additional great benefit for me was reading the forum assignments other students submitted and getting a peek at AI use cases for all sorts of purposes in business and education.
While the course covered a lot of ground that wasn’t directly relevant to my professional or personal goals, it did give me a strong foundation for what I intend to do: keep confidently exploring how we, freelancers in visual content creation, can use AI capabilities and emerging tools to supercharge personal and professional growth and take our freelance businesses to the next level.
If you’re curious about how Generative AI can benefit your business and professional development, let’s start with the basics today and explore its full potential together going forward.
Artificial Intelligence: Quick Overview
The concept of Artificial Intelligence (AI) has evolved significantly since its inception in the mid-20th century, when pioneering computer scientists began exploring the potential of machines that could mimic human intelligence.
From simple algorithms to complex neural networks, AI has allowed for innovations that seemed beyond the realm of possibility in the past.
Artificial Intelligence (AI) is a broad field that involves creating machines capable of performing tasks that are normally associated with human cognition.
Within AI, there are Large Language Models (LLMs) which are specific types of AI designed to understand and generate human-like text based on the information they’ve been trained on.
Understanding Large Language Models (LLMs)
Large Language Models (LLMs), for example OpenAI’s GPT (Generative Pre-trained Transformer), are a type of artificial intelligence designed to understand and generate text the way we humans do.
Think of them as incredibly smart digital assistants that can help write essays, stories, even poetry, summarize articles, and answer complex questions.
How LLMs Are Trained:
- Collecting Text Data: LLMs start by gathering a huge amount of text from books, websites, articles, and other written sources. This collected text helps them understand language patterns, grammar, facts, and general knowledge.
- Pre-Training: During pre-training, the model processes the massive dataset and learns to predict the next word in a sentence, given the words that came before it. For example, if the model sees “The sky is ____,” it might predict “blue” based on the context.
- Fine-Tuning: Once the initial training is complete, the model undergoes fine-tuning, where it’s adjusted to perform specific tasks like answering questions, summarizing text, or chatting. This involves training the model on smaller, task-specific datasets to refine its abilities.
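To make the “predict the next word” idea from the pre-training step a bit more tangible, here is a minimal Python sketch. It is only a toy: it counts which word tends to follow which in four made-up sentences, whereas real LLMs use neural networks trained on vastly more text. The corpus and the function names (`train_bigram_counts`, `predict_next`) are my own illustrative assumptions, not anything from the course or from OpenAI.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus; a real LLM trains on vastly more text.
corpus = [
    "the sky is blue",
    "the sky is blue today",
    "the sky is clear",
    "the sea is blue",
]

def train_bigram_counts(sentences):
    """Count, for each word, which words follow it and how often."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for current_word, next_word in zip(words, words[1:]):
            counts[current_word][next_word] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

counts = train_bigram_counts(corpus)
print(predict_next(counts, "is"))  # prints "blue"
```

In this toy corpus “blue” follows “is” three times and “clear” only once, so “blue” wins. The takeaway is simply that the prediction comes from patterns in the training text, which is why what a model was trained on matters so much.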
The way a model is trained shapes the best practices for using it – that’s why the same prompts and methods can generate great content in, say, ChatGPT running GPT-4, but not do as well in Gemini.
To give you an idea of the amount of information used to train LLMs and the duration of such training, here’s a screenshot from Andrej Karpathy’s (computer scientist and co-founder of OpenAI) presentation we were introduced to in the course:
Why LLMs Are Useful For Us:
- Versatility: They can help with writing, brainstorming, coding, and answering questions across various topics.
- Speed and Scale: LLMs can process and generate text much faster than humans and can handle large-scale tasks quickly.
- Ideation & Brainstorming: They can suggest new ideas, explore different writing styles, and inspire creative projects.
By understanding the basics of LLMs and the best practices for using them (I will explore valuable use cases of Generative AI for freelancers in a different post), we, independent creatives, can see how these AI tools can become valuable partners in our work, helping us generate ideas, streamline tasks, and explore new opportunities much faster than we could even imagine before!
Here’s a fun educational video about LLMs, which was created using Gen AI tools:
What Are GPTs?
Generative Pre-trained Transformers (GPTs) are a type of large language model (LLM) that have a much broader application than just text generation.
While they are primarily designed for text generation, they are instrumental in creating text-based prompts, scripts, and ideas that can be used by specialized generative models to produce visual content.
Indirect Generation via Text Prompts:
GPTs themselves are not directly responsible for generating images and videos, but they help produce descriptive text prompts that guide other models like DALL·E, Midjourney or Stable Diffusion.
- DALL·E: Developed by OpenAI, DALL·E is a neural network that generates images based on text descriptions. GPTs can write detailed prompts that DALL·E uses to create images.
- Midjourney: Another very popular AI-powered platform specializing in creating visually stunning, artistic images from textual prompts.
- Stable Diffusion: An open-source AI model that can generate high-quality images from well-crafted prompts created by GPTs.
🟡 Side-note: Midjourney is my image generation platform of choice. It has recently moved from Discord to a dedicated website, Midjourney Alpha, and the UI is now so much cleaner and easier to navigate! Amazing news for its users! Here’s a screenshot from my account:
There are several other ways GPTs fit into visual work:
- Image-to-Text and Text-to-Image: GPTs can describe the content of existing images or photos (image-to-text) or create prompts that enable generative models to visualize specific scenes (text-to-image).
- Video Generation: GPTs cannot directly generate videos, but they can provide scripts or storyboards for video content.
- Multimodal Models: Models like GPT-4 Vision (a multimodal version of GPT-4) can analyze images, answer questions about their content, and generate descriptive captions, which can be part of a video creation pipeline.
Collaborative Applications for Image and Video Generation Workflows: GPTs write the prompts, which are then passed to image-generating models like DALL·E or Midjourney, or scripts to guide video-generating models such as Runway Gen-2 or Synthesia.
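For those who like to see a workflow spelled out, here is a rough Python sketch of the hand-off described above: a brief goes to a text model, and the prompt it writes goes to an image model. Everything in it is hypothetical. `run_visual_pipeline`, `fake_text_model` and `fake_image_model` are stand-ins I made up so the example runs on its own, and no real DALL·E, Midjourney or OpenAI API is called.

```python
from typing import Callable

def run_visual_pipeline(
    brief: str,
    write_prompt: Callable[[str], str],
    generate_image: Callable[[str], str],
) -> dict:
    """Pass a brief to a text model, then pass its prompt to an image model."""
    prompt = write_prompt(brief)
    image_ref = generate_image(prompt)
    return {"brief": brief, "prompt": prompt, "image": image_ref}

# Hypothetical stand-ins so the sketch runs without any API access.
def fake_text_model(brief: str) -> str:
    # A real GPT would write a much richer prompt here.
    return f"Studio product photo, {brief}, soft side lighting, dark background"

def fake_image_model(prompt: str) -> str:
    # A real image model would return an image, not a string.
    return f"image generated from: {prompt}"

result = run_visual_pipeline(
    "male fragrance bottle", fake_text_model, fake_image_model
)
print(result["prompt"])
```

In a real workflow the two stand-in functions would be replaced by whichever text and image tools you actually use; the shape of the hand-off stays the same.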
What is Generative AI?
Generative AI is a broader term that refers to AI systems that can generate content, whether it’s text, images, video or music, while LLMs like GPTs are a subset of generative AI focused specifically on text generation as we’ve just explored above.
Essentially, AI is the overarching concept, generative AI includes any AI that creates new content, and LLMs like GPTs are specialized tools within that category for dealing with text.
My personal simplified take on how I see the use of Generative AI for freelancers is this:
Imagine a huge virtual arcade claw machine – the claw is the GPT you guide (through prompts) to grab the specific toy you want (content – text or visuals), which the GPT assembles for you from a massive pile of toy parts inside the machine.
In order to successfully assemble and grab just the toy (content) you need, you have to have three things. First, a goal in mind: how your toy will function and look (what information you need the Gen AI tool to analyze and assemble in a specific way for you). Second, a strategy for getting that specific result (knowing how the Gen AI tool you are using is trained to function and giving it input data that it will interpret correctly). Third, an understanding of how to actually lead the claw to your desired toy (writing and adjusting prompts for the specific Gen AI tool you are using).
Like I said, simplified…
Just for the fun of it, I’ve used the descriptive part of how I imagine GPTs work and plugged it into Midjourney – it actually put “GPT” onto this thing 😊
This demonstrates that a single prompt may get you something visually ok, but in order to actually get a proper representation of the concept you have in mind, you have to tweak your prompt, iterate, and create variations.
But the more you practice doing that, the better you become at getting to the right outcome faster.
Final Thoughts
Today, Generative AI for freelancers stands as one of the most groundbreaking developments in the field of artificial intelligence, as it has already begun to transform the content creation landscape.
This technology is not just enhancing the way we work but revolutionizing it, especially for creative professionals and knowledge workers.
Here’s a moodboard I generated for a male fragrance product photoshoot. Some might argue that I could have created this manually using traditional methods, but it took me a minute to write the prompt and have this whole thing compiled for me. As a busy creative professional, I’ll always choose tools that save me precious time.
Of course, some creatives may choose to reject AI tools, much like some analog photographers resisted digital cameras and Photoshop, and painters before them dismissed photography as an artistic medium, and that’s their prerogative.
I see generative AI as a new tool that requires learning and practice, just like any other, to consistently generate valuable content for all sorts of purposes.
Humans learn by absorbing data through our senses throughout our lives, and it shapes our understanding of this world, our perspective and creative vision, and what we put out into the world as a result.
Imagine a painter who creates new pieces by integrating styles from different art movements. Or a beginner photographer who is emulating other photographers’ work while learning the craft and tools to eventually form their own style.
Generative AI works similarly, but in the digital realm – and much, much faster – following our masterful creative direction.
It analyzes extensive data — images, text, code, etc. — to recognize patterns and characteristics that it can then use to create something new for us.
I am convinced that this technology – at least at this stage of its development, and let’s hope further down the line – is not about replacing human creativity but enhancing it, providing tools that can bring new ideas to life with efficiency and precision (trust, but verify, though!).
“Machine Learning is a paradigm of computer programming where, rather than applying deductive logic to produce output that is known to be correct like a pocket calculator, programs are designed to produce predictions, which are expected to be occasionally wrong”
– an excerpt from Colin Fraser’s article linked below.
🟡 Hallucinations, Errors, and Dreams: On why modern AI systems produce false outputs and what there is to be done about it by Colin Fraser.
I hope this was helpful!
We will look into practical ways to use generative AI tools for our personal and professional growth next time!
More Food For Thought
🔴 Explore Custom GPTs created through OpenAI’s ChatGPT platform: GPTs Statistics & Top 100 GPTs Ranked (Jan 2024) by Seo.ai (April 24, 2024);
🔴 These Custom GPTs will help you optimize your tasks with personalized AI capabilities: The 36 Best Custom GPTs of 2024 (Curated by Humans) by Seo.ai (April 24, 2024);
🔴 11 Best GPTs On the OpenAI Store That Will Actually Save You Time by Tech.co (January 30, 2024).