A cell contains “Invoice #4521 – March” and you need just the number. Or “PO-2025-0847-A” and you need the four-digit code in the middle. Or an address where you need to isolate the zip code from the end. Extracting specific parts from mixed text strings is a daily task for anyone working with imported data. Here are the key techniques.
The Data
| A | B | |
|---|---|---|
| 1 | Raw Text | Need to Extract |
| 2 | Invoice #4521 – March | 4521 |
| 3 | PO-2025-0847-A | 0847 |
| 4 | SKU: WDG-1200-BLK | WDG-1200-BLK |
| 5 | 123 Main St, Tampa FL 33612 | 33612 |
| 6 | Order 789 (Rush) | 789 |
LEFT, RIGHT, MID — Position-Based Extraction
When the text you need is always in the same position:
First N characters
=LEFT(A2, 5)
“Invoice #4521 – March” → “Invoi” (first 5 characters)
Last N characters
=RIGHT(A5, 5)
“123 Main St, Tampa FL 33612” → “33612” (last 5 characters — perfect for zip codes)
Characters from the middle
=MID(A3, 9, 4)
“PO-2025-0847-A” → “0847” (4 characters starting at position 9)
FIND + MID — Extracting After a Delimiter
When the position varies but there’s a consistent marker (like “#” or “:”):
Extract everything after “#”
=MID(A2, FIND("#",A2)+1, FIND(" ",A2,FIND("#",A2))-FIND("#",A2)-1)
“Invoice #4521 – March” → “4521”. Finds “#”, starts one character after it, and reads until the next space.
Extract everything after “: “
=MID(A4, FIND(": ",A4)+2, 100)
“SKU: WDG-1200-BLK” → “WDG-1200-BLK”. Finds “: ” and takes everything after it.
Extract Only Numbers from Mixed Text
For pulling just the numeric digits out of text like “Order 789 (Rush)”:
Excel 365 — TEXTJOIN + MID array
=TEXTJOIN("",TRUE,IF(ISNUMBER(MID(A6,ROW(INDIRECT("1:"&LEN(A6))),1)*1),MID(A6,ROW(INDIRECT("1:"&LEN(A6))),1),""))
Checks each character — if it’s a number, keeps it; if not, discards it. “Order 789 (Rush)” → “789”. Enter with Ctrl+Shift+Enter in older Excel versions.
Complex formula alert
The “extract only numbers” formula is powerful but hard to read and maintain. For a one-time cleanup it works. For recurring use or if non-technical users need to understand the workbook, a VBA function or Flash Fill is usually more practical.
Flash Fill — The Quick Way
For one-time extractions where you can show Excel the pattern:
1In B2, manually type the extracted value: 4521
2Move to B3 and press Ctrl+E
3Excel detects the pattern and fills the rest of the column
Flash Fill limitations
Flash Fill is fast but fragile. It doesn’t create a formula — it fills static values. If the source data changes, you have to re-run Flash Fill. Also, it sometimes misidentifies the pattern when data is inconsistent, so always spot-check.
TEXTBEFORE and TEXTAFTER (Excel 365)
Microsoft 365 added these functions that make text extraction dramatically simpler:
=TEXTAFTER(A4, ": ")
“SKU: WDG-1200-BLK” → “WDG-1200-BLK”. Extracts everything after the delimiter. One function instead of the FIND+MID combination.
=TEXTBEFORE(A2, " - ")
“Invoice #4521 – March” → “Invoice #4521”. Extracts everything before the delimiter.
When Formulas Aren’t Enough
Text parsing with formulas works for consistent patterns. But real data is rarely consistent:
- Variable formats — some rows have “Invoice #4521” while others have “INV-4521” or just “4521”
- Multiple extraction rules — different columns need different parsing logic
- Pattern recognition — identifying and extracting phone numbers, emails, or postal codes from free-text fields
- Thousands of records — complex array formulas slow the workbook to a crawl on large datasets
- Recurring imports — new data arrives weekly with the same parsing needs
A custom VBA text parser applies pattern matching (including regular expressions), handles format variations, processes thousands of rows in seconds, and can be reused on every import.
“He made suggestions to make the program better and as a result the project came out better than I had imagined.”
donovaneric · Freelancer
Parsing Messy Text Data from Imports?
Tell us about your data and we’ll build a parser that handles every variation.
Start a Project