Introduction

What Are Regular Expressions?

Some of the most common tasks in computing involve testing, analyzing, and modifying strings. For example, you may want to change names in a file from "First Last" format to "Last, First," or you may need to search for "Launch School" or "launchschool" in a document. This universe is home to regular expressions.

For simplicity, we'll use the term regex (pronounced "redj-ex") to refer to regular expressions. We use regex as both a singular and plural form; regexes is a bit of a mouthful.

Regex are patterns that we can use to search for information of interest in a set of strings. We can use them to perform conditional tests, extract desired information, or even modify information. If you've ever used wildcards on your computer, you can think of regex as the grown-up version of these simple search patterns.

We sometimes use the terms regex and pattern interchangeably; you'll find this elsewhere as well. For our purposes, we try to use the term "regex" to refer to the final form of a single regular expression used in a program, while "pattern" refers to the individual components that comprise the regex. However, the regex is sometimes called a pattern as well, and we sometimes do that too.

Regex are present all over the place. You'll find them in nearly every programming language that you use. This book concentrates on the Ruby, JavaScript, and Python implementations, but much of it applies regardless of which language or application you use.

What Are Regex Used For?

Regex have numerous uses; we've given you a mere sampling of the types of problems that you can solve with regex.

  • Check whether the pattern ss appears in the string Mississippi.
  • Check whether the letter i occurs three or more times in Mississippi.
  • Find and replace all instances of Mrs with Ms in a document.
  • Does a file name begin with Bob and end with .txt or .md?
  • Does a string have any non-alphanumeric characters in it?
  • Did the user enter a valid integer?
  • Are there any whitespace characters in the string?
  • Find and replace all non-alphanumeric characters in a string with -.
  • Locate all email addresses in a document.
  • Split a line of text into fields assuming that each whitespace character delimits two values.

This book will help you deal with these tasks by using regex.

A First Taste

At their most basic, regex are strings of characters between two / characters, e.g., /cat/. This regex matches the string cat anywhere it occurs in some text. For example, /cat/ matches cat, scat, catalog, and even scatter. It does not, however, match Cat or c a t.

Such simple patterns are useful, but you can typically find a more straightforward way to write your code. However, when you go a bit deeper, you run into regex like /\bhttps?:\/\/\S+\?/ which looks like gibberish. It is useful, though: it matches any web URL that contains a query string, regardless of whether the URL uses the HTTP or HTTPS scheme. This example hints at the expressiveness and power available with regex, and you will soon learn how to write them.

Do I Have to Know About Regex?

Technically no, but we expect you to know the basics to make your way through the Launch School curriculum. Everything that you can do with regex you can without them. However, that doesn't mean it is easy. There is an incredible amount of expressiveness in regex that is exceptionally difficult to emulate using your language's non-regex facilities.

The more accurate answer is that you must know the basics of regex, and that's what this book teaches. You don't need to be an expert, but you do need to recognize and understand regex when you see them. Knowing regex helps you solve complex pattern matching problems, even without knowing the more advanced details.

How Often Will I Use Regex?

That depends on the types of programs you work on in the future. If you never do any string processing -- a rare situation -- you won't use regex in your programs often. However, if you do any string processing at all, regex will help you do your job. The more string processing you do, the more you can expect to use regex.