The world of AI has been awash in hype for more than two years now, ever since the release of ChatGPT in November 2022. New tools and technologies emerge daily, promising to revolutionize our work and lives. If you're looking to harness the power of large language models (LLMs) for personal projects or even professional applications, Ollama might be the key. In this post, you will learn what Ollama is, how to download and install it, how to use it to run a small model, and then how to run DeepSeek R1 (another hugely popular LLM); let’s get started.
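If you want a taste before the full walkthrough, here is a minimal sketch of chatting with a locally running Ollama server from Python; the model tag and prompt are placeholders, and DeepSeek R1 is pulled and run the same way.

```python
# A minimal sketch, assuming the ollama Python package is installed
# (pip install ollama), the Ollama server is running locally, and a small
# model has already been pulled, e.g. `ollama pull llama3.2:1b`.
import ollama

response = ollama.chat(
    model="llama3.2:1b",  # illustrative tag; swap in the model you pulled
    messages=[{"role": "user", "content": "Explain what Ollama does in one sentence."}],
)
print(response["message"]["content"])
```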

LLMs generally reply in a nondeterministic format; they do not always comply with the formatting instructions they are given. This is where controlled generation (structured output) comes into play: you constrain an LLM's reply to conform to a given schema. In this post, you will learn how to use Gemini over Vertex AI with controlled generation to get structured output that follows a schema, and use it to summarize and categorize job listings. Let's get going!
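To make the idea concrete, here is a minimal sketch of controlled generation with the Vertex AI SDK; the project ID, model name, schema, and prompt are illustrative placeholders rather than the exact code from the post.

```python
# A minimal sketch, assuming google-cloud-aiplatform is installed and the
# project/location below are replaced with your own.
import vertexai
from vertexai.generative_models import GenerationConfig, GenerativeModel

vertexai.init(project="your-project-id", location="us-central1")  # placeholder project

# Illustrative schema for summarizing and categorizing one job listing.
job_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "category": {"type": "string", "enum": ["engineering", "data", "product", "other"]},
        "summary": {"type": "string"},
    },
    "required": ["title", "category", "summary"],
}

model = GenerativeModel("gemini-1.5-flash")
response = model.generate_content(
    "Summarize and categorize this job listing: ...",
    generation_config=GenerationConfig(
        response_mime_type="application/json",
        response_schema=job_schema,  # the reply is constrained to this schema
    ),
)
print(response.text)  # JSON that conforms to job_schema
```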

Ollama is a great way to run many open Large Language Models (LLMs). You can run Google Gemma 2, Phi 4, Mistral, and Llama 3 on your machine or in the cloud with Ollama. You can also host these open LLMs as APIs using Ollama. In this post, you will learn how to host Gemma 2 (2b) with Ollama 0.5.x on Google Cloud Run; let’s get started!
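Once Gemma 2 is running behind Cloud Run, clients can talk to it through Ollama's standard REST API. Here is a minimal sketch of such a call, assuming an unauthenticated service; the URL is a placeholder for whatever Cloud Run assigns to your deployment.

```python
# A minimal sketch, assuming the requests package is installed and the
# Cloud Run service allows unauthenticated access.
import requests

OLLAMA_URL = "https://your-ollama-service-xxxxxxxx-uc.a.run.app"  # placeholder URL

resp = requests.post(
    f"{OLLAMA_URL}/api/generate",
    json={"model": "gemma2:2b", "prompt": "Say hello in one sentence.", "stream": False},
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])  # the generated text from Gemma 2 (2b)
```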

Do you feel your resume, LinkedIn profile, or GitHub contributions aren't conveying the right message? Your CV, LinkedIn profile, and GitHub repositories are your digital storefront, and keeping them fresh and relevant is key to attracting opportunities. In this post, we'll explore how to leverage Google Gemini 2.0's real-time streaming capabilities to improve your CV, LinkedIn profile, and GitHub presence, focusing on practical examples and actionable strategies to help you land a tech role. Let's dive in!
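As a flavour of what streaming output from Gemini looks like, here is a minimal sketch using the google-genai SDK; the API key, model name, and prompt are placeholders, and this streams text chunks rather than the full real-time Live API.

```python
# A minimal sketch, assuming google-genai is installed (pip install google-genai)
# and a valid API key is substituted below.
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")  # placeholder key

# Stream the model's answer chunk by chunk instead of waiting for the full reply.
for chunk in client.models.generate_content_stream(
    model="gemini-2.0-flash",
    contents="Suggest three ways to make a backend engineer's CV more impactful.",
):
    print(chunk.text or "", end="", flush=True)
```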

More posts can be found in the archive.

Stay Connected

Follow me on LinkedIn for new posts, engineering insights, and tech takes — straight from the trenches.
