Performing Linear Regression with Python and Scikit-learn

Sometimes, life is easy. There are times when you are building a Machine Learning model for regression and you find your data to be linear. In other words, a regression model can be fit by means of a straight line. While these cases are relatively rare, linear regression is still a useful tool for in your Machine Learning toolkit.

What is Linear Regression? And how does it work? That's what we will investigate in today's Machine Learning article. It is structured as follows. First of all, we will be introducing Linear Regression conceptually, specifically Ordinary Least Squares based Linear Regression. We'll look at what regression is in the first place, and then introduce the linear variant - explaining the maths behind it in an intuitive way, so that it'll be entirely clear what is going on. We also cover how Linear Regression is performed, i.e., how after regressing a fit the model is improved, yielding better fits.

Subsequently, we'll move from theory into practice, and implement Linear Regression with Python by means of the Scikit-learn library. We will generate a dataset where a linear fit can be made, apply Scikit's LinearRegression for performing the Ordinary Least Squares fit, and show you with step-by-step examples how you can implement this yourself.

Let's take a look :)

Introducing Linear Regression

In this section, we will be looking at how Linear Regression is performed by means of an Ordinary Least Squares fit. For doing so, we will first take a look at regression in general - what is it, and how is it useful? Then, we'll move forward to Linear Regression, followed by looking at the different types for performing regression analysis linearly. Finally, we zoom in on the specific variant that we will be using in this article - Oridnary Least Squares based linear regression - and will explore how it works.

Of course, since we're dealing with a method for Machine Learning, we cannot fully move away from maths. However, I'm not a big fan of writing down a lot of equations without explaining them. For this reason, we'll explain the math in terms of intuitions, so that even though when you cannot fully read the equations, you will understand what is going on.

What is Regression?

Most generally, we can define regression as follows:

Regression analysis is a set of statistical processes for estimating the relationships between a dependent variable (often called the 'outcome variable') and one or more independent variables (often called 'predictors', 'covariates', or 'features').

In other words, suppose that we have the following dataset:

Hi, I'm Chris!

I know a thing or two about AI and machine learning. Welcome to MachineCurve.com, where machine learning is explained in gentle terms.

Getting started

Foundation models

Learn how large language models and other foundation models are working and how you can train open source ones yourself.

Keras

Keras is a high-level API for TensorFlow. It is one of the most popular deep learning frameworks.

TensorFlow

TensorFlow is the most popular deep learning framework. It is is used by many companies.

PyTorch

PyTorch is a deep learning framework which is popular for its ease of use and flexibility.

Machine learning theory

Read about the fundamentals of machine learning, deep learning and artificial intelligence.

Transformer architectures

Emerging since 2017, Transformer architectures are part of the state of the art in deep learning.

Most recent articles

January 8, 2024

LLM in a Flash: improving memory requirements of large language models

January 2, 2024

What is Retrieval-Augmented Generation?

December 27, 2023

Building a zero-shot image classifier with CLIP and HuggingFace Transformers

December 27, 2023

In-Context Learning: what it is and how it works

December 22, 2023

CLIP: how it works, how it's trained and how to use it

Article tags

fit

least squares

linear regression

machine learning

ordinary least squares

python

regression

scikit learn

Connect on social media

Connect with me on LinkedIn

To get in touch with me, please connect with me on LinkedIn. Make sure to write me a message saying hi!

See my work on GitHub

My work is available on GitHub. Feel free to check it out and see if it can be of use to you!

Side info

The content on this website is written for educational purposes. In writing the articles, I have attempted to be as correct and precise as possible. Should you find any errors, please let me know by creating an issue or pull request in this GitHub repository.

All text on this website written by me is copyrighted and may not be used without prior permission. Creating citations using content from this website is allowed if a reference is added, including an URL reference to the referenced article.

If you have any questions or remarks, feel free to get in touch.

TensorFlow, the TensorFlow logo and any related marks are trademarks of Google Inc.

PyTorch, the PyTorch logo and any related marks are trademarks of The Linux Foundation.

Montserrat and Source Sans are fonts licensed under the SIL Open Font License version 1.1.

Mathjax is licensed under the Apache License, Version 2.0.

Performing Linear Regression with Python and Scikit-learn

December 10, 2020 by Chris

Introducing Linear Regression

What is Regression?

Hi, I'm Chris!

I know a thing or two about AI and machine learning. Welcome to MachineCurve.com, where machine learning is explained in gentle terms.

Getting started

Foundation models

Keras

TensorFlow

PyTorch

Machine learning theory

Transformer architectures

Most recent articles

January 8, 2024

LLM in a Flash: improving memory requirements of large language models

January 2, 2024

What is Retrieval-Augmented Generation?

December 27, 2023

Building a zero-shot image classifier with CLIP and HuggingFace Transformers

December 27, 2023

In-Context Learning: what it is and how it works

December 22, 2023

CLIP: how it works, how it's trained and how to use it

Article tags

Most popular articles

February 18, 2020

How to use K-fold Cross Validation with TensorFlow 2 and Keras?

December 28, 2020

Introduction to Transformers in Machine Learning

December 27, 2021

StyleGAN, a step-by-step introduction

July 17, 2019

This Person Does Not Exist - how does it work?

October 26, 2020

Your First Machine Learning Project with TensorFlow 2.0 and Keras

Connect on social media

Connect with me on LinkedIn

See my work on GitHub

Side info

Getting started

Foundation models

Keras

TensorFlow

PyTorch

Machine learning theory

Transformer architectures

Most popular articles

February 18, 2020

How to use K-fold Cross Validation with TensorFlow 2 and Keras?

December 28, 2020

Introduction to Transformers in Machine Learning

December 27, 2021

StyleGAN, a step-by-step introduction

July 17, 2019

This Person Does Not Exist - how does it work?

October 26, 2020

Your First Machine Learning Project with TensorFlow 2.0 and Keras

Side info

Connect with me on LinkedIn

See my work on GitHub