How to Scrape DoorDash Listings, Stores, Restaurants, and More w/ Python
DoorDash moves millions of orders every day: groceries, fast food, alcohol, pet supplies, and everything in between.
For businesses working in delivery, retail intelligence, or market research, this data is gold.
But DoorDash doesn’t give up its data easily. It’s protected behind Cloudflare, locked behind backend address validation, and scattered across different endpoints depending on whether you’re scraping a local restaurant or a national chain.
But we didn’t craft this detailed guide for nothing :)
Find ALL WORKING CODE in this GitHub folder ⚙
In this guide, we’ll set up an address on the backend and then scrape restaurant listings, menus, catalogs, and more, using Scrape.do to bypass Cloudflare protection, all with simple Python scripts.
Starting with the most crucial step:
Why is Scraping DoorDash Difficult?
From a business perspective, it makes sense to block scrapers; menus, delivery zones, pricing, and reviews are constantly changing and highly localized.
Opening up that data means opening the door to competitor tracking, shadow apps, and data resellers.
So they’ve built real barriers. Not simple obfuscation, mind you; real protection.
Heavy Cloudflare Protection Against Bots
DoorDash sits behind one of the strictest Cloudflare configurations I’ve seen.
In addition to regular challenges, it checks for TLS fingerprint mismatches, header consistency, browser quirks, and, most importantly, IP reputation.
Cloudflare sits at the door of DoorDash like Cerberus.
We’ll solve this challenge by using a scraping proxy like Scrape.do, which automatically handles TLS fingerprinting, header spoofing, and IP rotation behind the scenes, giving us direct access.
You Need to Submit Your Address on the Backend
First thing DoorDash asks you when you pull up their homepage:
Your address.
The platform won’t return any real store data unless you pick a full address (with street, city, state, and ZIP) on the frontend and then submit it through their backend.
When you enter an address on the homepage, it sends a backend request, validates the address, and attaches it to your active session.
That session (and its attached address) is required to fetch any listings, menus, or delivery options. Scrape.do will also come in extremely handy for sticky sessions.
In fact, that’s what we’ll start our guide with:
Adding Your Address to DoorDash via Backend
To scrape anything meaningful from DoorDash (restaurants, menus, catalogs), you first need to attach an address to your session.
But this isn’t handled via the URL or any visible parameters.
Instead, DoorDash sends a GraphQL mutation behind the scenes that registers the address to your session.
That’s what we’ll replicate now using:
- requests to send the POST request
- json to construct the payload
- Scrape.do to handle Cloudflare, TLS, sticky sessions, and header spoofing
You’ll also need your Scrape.do API token. If you don’t have one yet, sign up here for 1000 free credits.
Step 1: Finding the GraphQL Mutation
When you enter an address on the DoorDash homepage, your browser sends a backend request to:
https://www.doordash.com/graphql/addConsumerAddressV2?operation=addConsumerAddressV2
This is a POST request, which means that instead of just fetching data, it’s sending data to the server to register your address with your session.
The data you send through this POST request is called the payload. It travels in the body of the request, not in the URL, which is why it’s a step up from basic URL-based scraping.
To view and copy the payload:
Right-click and Inspect before visiting the DoorDash homepage (or any page of the site). Open the Network tab in the developer tools that pop up and make sure the Preserve log option in the top toolbar is checked (so requests aren’t cleared when a new page loads).
Pick an address (preferably somewhere DoorDash is available) and save it.
Now your browser will send the POST request we just talked about. Once the new page loads, search for addConsumerAddress… using the filter in the top toolbar. Here’s what you should see:
DoorDash uses a system called GraphQL, which always sends two things in the payload:
- query: a big string that defines what operation to run (in this case, addConsumerAddressV2)
- variables: a JSON object containing all the real values (latitude, longitude, city, ZIP, etc.)
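Schematically, the payload you’ll copy has this shape (a trimmed illustration; the full version, with the complete mutation string and every address field, is shown in Step 2):

payload = {
    # The GraphQL mutation string, copied verbatim from DevTools (truncated here)
    "query": "mutation addConsumerAddressV2($lat: Float!, $lng: Float!, ...) { ... }",
    # The real address values that fill in the mutation's parameters
    "variables": {
        "lat": 43.0657,
        "lng": -73.7908,
        "city": "Saratoga Springs",
        "state": "NY",
        "zipCode": "12866",
        # ...plus printableAddress, googlePlaceId, and the rest
    },
}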
Is understanding what you see on this screen very helpful for web scraping projects?
Yes.
Do you need to understand it to scrape DoorDash?
Nope.
So for now, just being able to access this payload is enough. Now we’ll replicate this POST request one-to-one.
Step 2: Rebuilding the Request in Python
We’ll now send the exact same POST request that the browser sends, but from Python.
We do this by constructing a POST request with two components:
- the target URL, which is the DoorDash GraphQL endpoint
- the payload, which contains both the query string and the variables (your address data)
But instead of sending it directly, we’ll pass everything through Scrape.do to bypass Cloudflare, spoof the headers and TLS fingerprint, and keep the session alive.
Here’s the full structure:
import requests
import json
# Scrape.do API token and target URL
TOKEN = "<your-token>"
TARGET_URL = "https://www.doordash.com/graphql/addConsumerAddressV2?operation=addConsumerAddressV2"
# Scrape.do API endpoint
api_url = (
"http://api.scrape.do/?"
f"token={TOKEN}"
f"&super=true"
f"&url={requests.utils.quote(TARGET_URL)}"
)
Then we define the payload we copied from what the browser sent (I’m including the full payload here as a Python dict, so bear with me):
payload = {
"query": """
mutation addConsumerAddressV2(
$lat: Float!, $lng: Float!, $city: String!, $state: String!, $zipCode: String!,
$printableAddress: String!, $shortname: String!, $googlePlaceId: String!,
$subpremise: String, $driverInstructions: String, $dropoffOptionId: String,
$manualLat: Float, $manualLng: Float, $addressLinkType: AddressLinkType,
$buildingName: String, $entryCode: String, $personalAddressLabel: PersonalAddressLabelInput,
$addressId: String
) {
addConsumerAddressV2(
lat: $lat, lng: $lng, city: $city, state: $state, zipCode: $zipCode,
printableAddress: $printableAddress, shortname: $shortname, googlePlaceId: $googlePlaceId,
subpremise: $subpremise, driverInstructions: $driverInstructions, dropoffOptionId: $dropoffOptionId,
manualLat: $manualLat, manualLng: $manualLng, addressLinkType: $addressLinkType,
buildingName: $buildingName, entryCode: $entryCode, personalAddressLabel: $personalAddressLabel,
addressId: $addressId
) {
defaultAddress {
id
addressId
street
city
subpremise
state
zipCode
country
countryCode
lat
lng
districtId
manualLat
manualLng
timezone
shortname
printableAddress
driverInstructions
buildingName
entryCode
addressLinkType
formattedAddressSegmentedList
formattedAddressSegmentedNonUserEditableFieldsList
__typename
}
availableAddresses {
id
addressId
street
city
subpremise
state
zipCode
country
countryCode
lat
lng
districtId
manualLat
manualLng
timezone
shortname
printableAddress
driverInstructions
buildingName
entryCode
addressLinkType
formattedAddressSegmentedList
formattedAddressSegmentedNonUserEditableFieldsList
__typename
}
id
userId
timezone
firstName
lastName
email
marketId
phoneNumber
defaultCountry
isGuest
scheduledDeliveryTime
__typename
}
}
""",
"variables": {
"googlePlaceId": "D000PIWKXDWA",
"printableAddress": "99 S Broadway, Saratoga Springs, NY 12866, USA",
"lat": 43.065749988891184,
"lng": -73.79078001715243,
"city": "Saratoga Springs",
"state": "NY",
"zipCode": "12866",
"shortname": "National Museum Of Dance",
"addressId": "1472738929",
"subpremise": "",
"driverInstructions": "",
"dropoffOptionId": "2",
"addressLinkType": "ADDRESS_LINK_TYPE_UNSPECIFIED",
"entryCode": ""
}
}
Then encode the payload and send the request.
From the response, the only thing we’re interested in is the scrape.do-rid response header, so we print that out:
response = requests.post(api_url, data=json.dumps(payload))
scrape_do_rid = response.headers.get("scrape.do-rid")
print(f"scrape.do-rid: {scrape_do_rid}")
The scrape.do-rid value identifies the session created in Scrape.do’s cloud. If the POST request was successful, that session now has the address we sent in the payload registered to it, giving us access to DoorDash.
The response should look like this:
scrape.do-rid: 4f699c-40-0-459851
To keep using this session, we only need the last 6 digits (e.g. 459851), and we’ll carry that forward in every future request.
⚠ Scrape.do sessions remain active while your requests are successful, but they will eventually expire, or DoorDash will flag your session as spam and block it. Make sure to renew sessions regularly to stay unblocked.
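If you want to automate that renewal, one option is to wrap the mutation in a small helper and call it whenever requests start getting blocked. A minimal sketch, reusing the api_url and payload defined above:

def create_session() -> str:
    """Send the address mutation and return the short session ID suffix."""
    resp = requests.post(api_url, data=json.dumps(payload))
    rid = resp.headers.get("scrape.do-rid", "")
    # "4f699c-40-0-459851" -> "459851"
    return rid.split("-")[-1]

session_id = create_session()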
Scrape All DoorDash Restaurants and Stores for a Location
Using our session with an address registered, we can scrape all stores available for that location.
DoorDash exposes this through, again, a GraphQL endpoint called homePageFacetFeed.
This is what powers the homepage restaurant feed after you select an address.
Our goal in this section is to:
- Send a POST request to homePageFacetFeed using our existing session ID to get the first batch of stores
- Save the raw JSON response (which contains dozens of storefronts) to a local file
- Edit the cursor variable and loop through all batches to get the remaining stores
- Finally, parse the relevant store information and export all details to a CSV file
💡 Open and available stores and restaurants change throughout the day and across days of the week; to make sure you get all the data, you might want to run your scraper hourly or daily.
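A minimal way to do that, assuming the scraper is wrapped in a main() function like the one we build below (a cron job or any task scheduler works just as well):

import time

while True:
    main()               # run one full scrape of the current feed
    time.sleep(60 * 60)  # wait an hour before the next pass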
Get Query Payload for the homePageFacetFeed Request
This is a step you’re familiar with.
To get the payload we’ll use in our script, make sure the address set in your browser is the same one you used to generate the session ID from addConsumerAddress.
Then:
- Visit the DoorDash homepage (or refresh it)
- Open DevTools → Network tab
- Search for homePageFacetFeed
From there, extract the query and variables fields from the Payload tab just like we did earlier. We’ll use these in our next request (and every request we make to this API endpoint).
Build Request and Print First Batch of Stores
Now let’s replicate the homePageFacetFeed request in Python.
We’ll send the same query and variables you copied earlier from DevTools, but route the request through Scrape.do using the session ID from your addConsumerAddress step. This session ID keeps our context intact, so DoorDash knows which address we’re fetching store data for.
Start by importing the necessary libraries and defining the basic config. You’ll need your Scrape.do token, the session ID (the last 6 digits from the previous step), and the target GraphQL endpoint:
import requests
import json
TOKEN = "<your-token>"
SESSION_ID = "<session-id>" # e.g. "459851"
TARGET_URL = "https://www.doordash.com/graphql/homePageFacetFeed?operation=homePageFacetFeed"
API_URL = f"http://api.scrape.do/?token={TOKEN}&super=true&url={TARGET_URL}&sessionId={SESSION_ID}"
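One small tweak worth considering: as in the previous script, it’s safer to URL-encode the target URL before embedding it, so the ?operation=... part isn’t parsed as a separate parameter of the Scrape.do call:

# Same endpoint, but with the target URL percent-encoded before embedding it
API_URL = (
    "http://api.scrape.do/?"
    f"token={TOKEN}"
    f"&super=true"
    f"&url={requests.utils.quote(TARGET_URL)}"
    f"&sessionId={SESSION_ID}"
)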
Now, take the full GraphQL query string and the variable set you copied from your browser, and plug them in as the request payload:
payload = {
"query": "{full-payload-from-prev-step}",
"variables": {
"cursor": "eyJvZmZzZXQiOjAsInZlcnRpY2FsX2lkcyI6WzEwMDMzMywzLDIsMyw3MCwxMDMsMTM5LDE0NiwxMzYsMjM1LDI2OCwyNDEsMjM2LDIzOSw0LDIzOCwyNDMsMjgyXSwicm9zc192ZXJ0aWNhbF9wYWdlX3R5cGUiOiJIT01FUEFHRSIsInBhZ2Vfc3RhY2tfdHJhY2UiOltdLCJsYXlvdXRfb3ZlcnJpZGUiOiJVTlNQRUNJRklFRCIsImlzX3BhZ2luYXRpb25fZmFsbGJhY2siOm51bGwsInNvdXJjZV9wYWdlX3R5cGUiOm51bGwsInZlcnRpY2FsX25hbWVzIjp7fX0=",
"filterQuery": "",
"displayHeader": True,
"isDebug": False
}
}
Finally, send the request.
We’re not going to parse or print anything yet; instead, we’ll write the entire response to a file so we can make sure we’re getting stores in our response:
response = requests.post(API_URL, data=json.dumps(payload))
response.raise_for_status()
data = response.json()
with open("restaurants_first_batch.json", "w", encoding="utf-8") as f:
json.dump(data, f, ensure_ascii=False, indent=2)
print("Saved first page response to restaurants_first_batch.json")
The result should be a full JSON dump of the first batch of restaurants and storefronts DoorDash returned for your location, over 20K lines of JSON entries.
To make sure, check for the top restaurants you see in your browser and confirm they’re somewhere inside the JSON dump:
Once you’re positive you’re hitting the right API endpoint with the right payload and session ID, it’s time to go for everything.
Looping Through Multiple Store Batches with Cursor-Based Pagination
If there are only around 50 restaurants available for your address, the request above will easily fetch all of them.
But if there are more than 100 stores, DoorDash only loads a limited number with the first call.
For the next batch(es) of restaurants and stores, it changes the cursor variable in the payload we copied earlier, telling the server to continue the list from where the previous request left off.
This is called cursor-based pagination, and it’s very common in APIs powered by GraphQL.
💡 The cursor string is actually base64-encoded JSON containing "offset": 0 and a few other variables. We could decode it, change the offset, and re-encode it ourselves, but the response already tells us what the next cursor is (if there is one), so we can skip that step.
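If you’re curious, here’s how you could peek inside that cursor (purely optional; the scraper below never needs to do this, since each response hands us the next cursor):

import base64
import json

# The same cursor value we copied from DevTools (also used as initial_cursor below)
cursor = "eyJvZmZzZXQiOjAsInZlcnRpY2FsX2lkcyI6WzEwMDMzMywzLDIsMyw3MCwxMDMsMTM5LDE0NiwxMzYsMjM1LDI2OCwyNDEsMjM2LDIzOSw0LDIzOCwyNDMsMjgyXSwicm9zc192ZXJ0aWNhbF9wYWdlX3R5cGUiOiJIT01FUEFHRSIsInBhZ2Vfc3RhY2tfdHJhY2UiOltdLCJsYXlvdXRfb3ZlcnJpZGUiOiJVTlNQRUNJRklFRCIsImlzX3BhZ2luYXRpb25fZmFsbGJhY2siOm51bGwsInNvdXJjZV9wYWdlX3R5cGUiOm51bGwsInZlcnRpY2FsX25hbWVzIjp7fX0="

decoded = json.loads(base64.b64decode(cursor))
print(decoded["offset"])                   # 0 -> start of the list
print(decoded["ross_vertical_page_type"])  # "HOMEPAGE"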
We’ll now write a full loop that:
- sends the request with our cursor
- parses the response
- stores the results
- extracts the next cursor
- and repeats until there’s nothing left
Let’s start from the top of our script with our setup:
import requests
import json
import csv
TOKEN = "<your-token>"
TARGET_URL = "https://www.doordash.com/graphql/homePageFacetFeed?operation=homePageFacetFeed"
SESSION_ID = "<session-id>"
API_URL = f"http://api.scrape.do/?token={TOKEN}&super=true&url={TARGET_URL}&sessionId={SESSION_ID}"
Then paste only the query string from the homePageFacetFeed request you copied earlier; we’ll include the variables later:
QUERY = """query homePageFacetFeed($cursor: String, $filterQuery: String, $displayHeader: Boolean, $isDebug: Boolean, $cuisineFilterVerticalIds: String) {
homePageFacetFeed(cursor: $cursor filterQuery: $filterQuery displayHeader: $displayHeader isDebug: $isDebug cuisineFilterVerticalIds: $cuisineFilterVerticalIds) {
...FacetFeedV2ResultFragment
__typename
}
}
fragment FacetFeedV2ResultFragment on FacetFeedV2Result {
body {
id
header { ...FacetV2Fragment __typename }
body { ...FacetV2Fragment __typename }
layout { omitFooter __typename }
__typename
}
page { ...FacetV2PageFragment __typename }
header { ...FacetV2Fragment __typename }
footer { ...FacetV2Fragment __typename }
custom
logging
__typename
}
... (remaining fragments unchanged) ...
"""
That query tells DoorDash what kind of layout we want returned. It’s long and bloated, but we’ll only focus on one section from the response.
Now, let’s create the main scraping logic:
def main():
# Initial cursor for the first page of results
initial_cursor = "eyJvZmZzZXQiOjAsInZlcnRpY2FsX2lkcyI6WzEwMDMzMywzLDIsMyw3MCwxMDMsMTM5LDE0NiwxMzYsMjM1LDI2OCwyNDEsMjM2LDIzOSw0LDIzOCwyNDMsMjgyXSwicm9zc192ZXJ0aWNhbF9wYWdlX3R5cGUiOiJIT01FUEFHRSIsInBhZ2Vfc3RhY2tfdHJhY2UiOltdLCJsYXlvdXRfb3ZlcnJpZGUiOiJVTlNQRUNJRklFRCIsImlzX3BhZ2luYXRpb25fZmFsbGJhY2siOm51bGwsInNvdXJjZV9wYWdlX3R5cGUiOm51bGwsInZlcnRpY2FsX25hbWVzIjp7fX0="
cursor = initial_cursor
page_num = 1
count = 0
rows = []
That initial_cursor value is the exact value we saw in the original payload when visiting the homepage after setting our address. It tells DoorDash which section of the listing to return first.
Let’s now create our loop:
while True:
payload = {
"query": QUERY,
"variables": {
"cursor": cursor,
"filterQuery": "",
"displayHeader": True,
"isDebug": False
}
}
print(f"Requesting page {page_num}...")
try:
response = requests.post(API_URL, data=json.dumps(payload))
response.raise_for_status()
data = response.json()
except Exception as e:
print(f"Request or JSON decode failed: {e}")
break
Here we’re sending the same query on every loop, but with a different cursor. We use Scrape.do again as a proxy to avoid Cloudflare blocks and parse the JSON like we did before.
Now we need to find the "store_feed" section inside the homepage body:
home_feed = data.get("data", {}).get("homePageFacetFeed", {})
sections = home_feed.get("body", [])
store_feed = next((s for s in sections if s.get("id") == "store_feed"), None)
if not store_feed:
print("No store_feed section found!")
break
If everything is successful, we now have the current batch of restaurant listings inside store_feed["body"].
We haven’t parsed any data yet; we’ll do that in the next section. For now, let’s just focus on cursor handling.
After each batch, we check for the next cursor:
page_info = home_feed.get("page", {})
next_cursor = None
if page_info.get("next") and page_info["next"].get("data"):
try:
next_cursor = json.loads(page_info["next"]["data"]).get("cursor")
except Exception as e:
print(f"Failed to parse next cursor: {e}")
break
if not next_cursor:
break
cursor = next_cursor
page_num += 1
That finishes our pagination loop and we’re now walking through every page DoorDash sends.
Parsing Each Store Entry into Structured Rows
Now let’s focus on what we’re actually pulling out of each page.
Each response contains a section called "store_feed" buried inside the "homePageFacetFeed" → "body" array. This is where DoorDash puts the real store listings.
We start by locating it:
home_feed = data.get("data", {}).get("homePageFacetFeed", {})
sections = home_feed.get("body", [])
store_feed = next((s for s in sections if s.get("id") == "store_feed"), None)
This gives us the part of the response with all the individual store entries.
Now we iterate over the rows inside that store feed.
But we don’t want to collect everything; we only want rows that represent real DoorDash stores.
We filter those with a simple check:
for entry in store_feed.get("body", []):
if not entry.get("id", "").startswith("row.store:"):
continue
Once we know we’ve got a valid store, we start parsing.
Each store entry is a big dictionary containing text, custom, and logging sections.
These hold info like store name, description, delivery time, and so on, but some of them arrive as already-decoded dictionaries while others are still JSON strings. So we use helpers to normalize them:
text = entry.get("text", {})
custom = extract_custom(entry) # Might be a dict or a stringified JSON
logging = extract_logging(entry) # Same here
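Those two helpers (also included in the final script below) simply json.loads the field when it arrives as a string and pass it through otherwise:

# Parse the 'custom' field, which may be a JSON string or an already-decoded dict
def extract_custom(entry):
    val = entry.get("custom", "{}")
    return json.loads(val) if isinstance(val, str) else val

# Same treatment for the 'logging' field
def extract_logging(entry):
    val = entry.get("logging", "{}")
    return json.loads(val) if isinstance(val, str) else val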
With the text and metadata normalized, we construct a row:
row = [
text.get("title", "N/A"), # Store name
text.get("description", ""), # Store description
extract_custom_value(text.get("custom", []), "delivery_fee_string"), # Delivery fee
extract_custom_value(text.get("custom", []), "eta_display_string"), # ETA
custom.get("is_currently_available"), # Open now
custom.get("rating", {}).get("average_rating"), # Average rating
custom.get("rating", {}).get("display_num_ratings"), # Number of ratings
logging.get("price_range"), # Price range
logging.get("store_distance_in_miles"), # Distance in miles
custom.get("store_id") or logging.get("store_id"), # Store ID (fallback)
extract_link(entry) # Constructed from click event
]
Each of these fields corresponds to visible data on the DoorDash frontend, the info you’ll see when browsing.
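The other two helpers referenced in that row, extract_custom_value and extract_link, are also part of the final script; here they are for reference:

# Pull a value out of the list of {key, value} dicts under text["custom"]
def extract_custom_value(custom_list, key):
    for c in custom_list:
        if c.get("key") == key:
            return c.get("value", "")
    return ""

# Build the store link from the entry's click event data
def extract_link(entry):
    events = entry.get("events", {})
    data = events.get("click", {}).get("data")
    if data:
        try:
            link_data = json.loads(data)
            return link_data.get("domain", "") + link_data.get("uri", "")
        except Exception:
            return None
    return None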
We append each row into the rows list:
rows.append(row)
count += 1
And now our list contains every parsed store from every page.
Writing Everything Out to a CSV File
Now that we’ve collected all the store data into a list of rows, we just need to dump it into a CSV.
We’ll add this small section right after the QUERY bit to define the headers of our CSV file:
header = [
"Name", "Description", "Delivery Fee", "ETA", "Open Now", "Average Rating", "Number of Ratings", "Price Range", "Distance (mi)", "Store ID", "Link"
]
Then we’ll have our code create and write the CSV file, and print in the terminal how many stores were listed:
with open("doordash_restaurant_listings.csv", "w", encoding="utf-8", newline="") as f:
writer = csv.writer(f)
writer.writerow(header)
writer.writerows(rows)
print(f"Saved {count} stores to doordash_restaurant_listings.csv")
There are a few additions that we still have to make:
Complete the Code and Output
The bigger chunks of our scraper are done; there are still a few small parts that tie it together.
When starting our main function, we’ll initialize a few variables:
def main():
initial_cursor = "<cursor-goes-here>"
cursor = initial_cursor
page_num = 1 # For logging page progress
count = 0 # Track total store count
rows = [] # Final results container
We’re using page_num to track how many pages we’ve walked through, and we print it in each loop like this:
# Add right after we close the payload variables and before the try except statements where we send our POST requests
print(f"Requesting page {page_num}...")
And for the store ID, DoorDash sometimes puts it under custom["store_id"] and sometimes under logging["store_id"], so we use fallback logic:
custom.get("store_id") or logging.get("store_id")
When all of these (and maybe a few more lines I’ve missed) are added, here’s what the full code looks like (with the query payload omitted):
# Script to scrape DoorDash restaurant listings using the GraphQL API and save to CSV
import requests
import json
import csv
# API authentication and endpoint setup
TOKEN = "<your-token>"
TARGET_URL = "https://www.doordash.com/graphql/homePageFacetFeed?operation=homePageFacetFeed"
SESSION_ID = "<session-id>"
API_URL = f"http://api.scrape.do/?token={TOKEN}&super=true&url={TARGET_URL}&sessionId={SESSION_ID}"
# GraphQL query for fetching the home page facet feed
QUERY = "<full-query-payload>"
# CSV header for output file
header = [
"Name", "Description", "Delivery Fee", "ETA", "Open Now", "Average Rating", "Number of Ratings", "Price Range", "Distance (mi)", "Store ID", "Link"
]
# Helper to parse 'custom' field from entry, which may be a JSON string or dict
def extract_custom(entry):
val = entry.get("custom", "{}")
return json.loads(val) if isinstance(val, str) else val
# Helper to parse 'logging' field from entry, which may be a JSON string or dict
def extract_logging(entry):
val = entry.get("logging", "{}")
return json.loads(val) if isinstance(val, str) else val
# Helper to extract the store link from the entry's click event
def extract_link(entry):
events = entry.get("events", {})
data = events.get("click", {}).get("data")
if data:
try:
link_data = json.loads(data)
return link_data.get("domain", "") + link_data.get("uri", "")
except Exception:
return None
return None
# Helper to extract a value from a list of custom fields by key
def extract_custom_value(custom_list, key):
for c in custom_list:
if c.get("key") == key:
return c.get("value", "")
return ""
# Main scraping logic
def main():
# Initial cursor for the first page of results
initial_cursor = "eyJvZmZzZXQiOjAsInZlcnRpY2FsX2lkcyI6WzEwMDMzMywzLDIsMyw3MCwxMDMsMTM5LDE0NiwxMzYsMjM1LDI2OCwyNDEsMjM2LDIzOSw0LDIzOCwyNDMsMjgyXSwicm9zc192ZXJ0aWNhbF9wYWdlX3R5cGUiOiJIT01FUEFHRSIsInBhZ2Vfc3RhY2tfdHJhY2UiOltdLCJsYXlvdXRfb3ZlcnJpZGUiOiJVTlNQRUNJRklFRCIsImlzX3BhZ2luYXRpb25fZmFsbGJhY2siOm51bGwsInNvdXJjZV9wYWdlX3R5cGUiOm51bGwsInZlcnRpY2FsX25hbWVzIjp7fX0="
cursor = initial_cursor
page_num = 1
count = 0
rows = []
# Paginate through all available pages
while True:
payload = {
"query": QUERY,
"variables": {
"cursor": cursor,
"filterQuery": "",
"displayHeader": True,
"isDebug": False
}
}
print(f"Requesting page {page_num}...")
try:
response = requests.post(API_URL, data=json.dumps(payload))
response.raise_for_status()
data = response.json()
except Exception as e:
print(f"Request or JSON decode failed: {e}")
break
# Parse the main feed and find the section with store listings
home_feed = data.get("data", {}).get("homePageFacetFeed", {})
sections = home_feed.get("body", [])
store_feed = next((s for s in sections if s.get("id") == "store_feed"), None)
if not store_feed:
print("No store_feed section found!")
break
# Extract each store row from the feed
for entry in store_feed.get("body", []):
if not entry.get("id", "").startswith("row.store:"):
continue
text = entry.get("text", {})
custom = extract_custom(entry)
logging = extract_logging(entry)
row = [
text.get("title", "N/A"), # Store name
text.get("description", ""), # Store description
extract_custom_value(text.get("custom", []), "delivery_fee_string"), # Delivery fee
extract_custom_value(text.get("custom", []), "eta_display_string"), # ETA
custom.get("is_currently_available"), # Open now
custom.get("rating", {}).get("average_rating"), # Average rating
custom.get("rating", {}).get("display_num_ratings"), # Number of ratings
logging.get("price_range"), # Price range
logging.get("store_distance_in_miles"), # Distance in miles
custom.get("store_id") or logging.get("store_id"), # Store ID
extract_link(entry) # Store link
]
rows.append(row)
count += 1
# Check for next page cursor
page_info = home_feed.get("page", {})
next_cursor = None
if page_info.get("next") and page_info["next"].get("data"):
try:
next_cursor = json.loads(page_info["next"]["data"]).get("cursor")
except Exception as e:
print(f"Failed to parse next cursor: {e}")
break
if not next_cursor:
break
cursor = next_cursor
page_num += 1
# Write all collected rows to CSV file
with open("doordash_restaurant_listings.csv", "w", encoding="utf-8", newline="") as f:
writer = csv.writer(f)
writer.writerow(header)
writer.writerows(rows)
print(f"Saved {count} restaurants to doordash_restaurant_listings.csv")
if __name__ == "__main__":
main()
And this is what your output CSV will look like:
You’re now able to scrape restaurant listings wherever DoorDash is available!
Scraping Menu Items from a DoorDash Store/Restaurant Page
Unlike the homepage listings, individual DoorDash store pages don’t require any session ID or payload crafting.
⚠ This applies to local and small stores/restaurants only, not chain stores.
But still, it’s a relief.
The data is already there, just buried inside a giant JavaScript blob at the bottom of the page.
Let’s see step-by-step how to extract it.
We’ll start by fetching the page using Scrape.do (with super=true to bypass Cloudflare) and parsing the HTML with BeautifulSoup.
We’re scraping the entire menu of the Denny’s in Saratoga Springs:
import requests
import re
import csv
import json
from bs4 import BeautifulSoup
TOKEN = "<your-token>"
STORE_URL = "https://www.doordash.com/store/denny's-saratoga-springs-800933/28870947/"
API_URL = f"http://api.scrape.do/?token={TOKEN}&super=true&url={STORE_URL}"
response = requests.get(API_URL)
soup = BeautifulSoup(response.text, 'html.parser')
Now here’s the tricky part.
DoorDash uses a JavaScript call, self.__next_f.push(...), to fill the frontend with embedded JSON. Inside that structure lives everything we care about: menus, categories, item names, images, prices, and more.
We locate that script with:
menu_script = next(
script.string for script in soup.find_all('script')
if script.string and 'self.__next_f.push' in script.string and 'itemLists' in script.string
)
We then extract the raw embedded string using a regex:
embedded_str = re.search(
r'self\.__next_f\.push\(\[1,"(.*?)"\]\)',
menu_script,
re.DOTALL
).group(1).encode('utf-8').decode('unicode_escape')
This gives us a huge string of JSON-like content, but not in a form we can parse directly.
So we manually locate the "itemLists" array inside that string by finding where it starts with [ and where the matching ] ends:
start_idx = embedded_str.find('"itemLists":')
array_start = embedded_str.find('[', start_idx)
bracket_count = 0
for i in range(array_start, len(embedded_str)):
if embedded_str[i] == '[':
bracket_count += 1
elif embedded_str[i] == ']':
bracket_count -= 1
if bracket_count == 0:
array_end = i + 1
break
itemlists_json = embedded_str[array_start:array_end].replace('\\u0026', '&')
itemlists = json.loads(itemlists_json)
Now we’ve got a proper list of categories, each with a list of menu items.
Let’s loop through them and collect everything:
all_items = []
for category in itemlists:
for item in category.get('items', []):
name = item.get('name')
desc = item.get('description', '').strip() or None
price = item.get('displayPrice')
img = item.get('imageUrl')
Some items have a rating display string like 95% (132), which we’ll try to extract using regex:
rating = review_count = None
rds = item.get('ratingDisplayString')
if rds:
m2 = re.match(r'(\d+)%\s*\((\d+)\)', rds)
if m2:
rating = int(m2.group(1))
review_count = int(m2.group(2))
We then store all this info into a clean dictionary:
all_items.append({
'name': name,
'description': desc,
'price': price,
'rating_%': rating,
'review_count': review_count,
'image_url': img
})
Finally, we write everything into a CSV file:
with open('menu_items.csv', 'w', newline='', encoding='utf-8') as csvfile:
writer = csv.DictWriter(csvfile,
fieldnames=['name','description','price','rating_%','review_count','image_url'])
writer.writeheader()
writer.writerows(all_items)
print(f"Extracted {len(all_items)} items to menu_items.csv")
And here’s what your output should look like:
Scraping Products from DoorDash Categories (Grocery, Retail, etc.)
If you’re dealing with a large convenience store or retail chain on DoorDash like 7-Eleven, Walgreens, or Safeway, you won’t find a single screen that lists every item they sell.
There’s no all-in-one “store menu” like we had with restaurants.
Instead, DoorDash organizes retail products by categories such as drinks, snacks, medicine, frozen food, household supplies, and so on. Each of these is fetched individually from the backend using a different GraphQL query: categorySearch.
If you aim to get all the items a chain store has available for the location you’ve picked, you’ll need to scrape each category and stitch them together.
Setting Up the GraphQL Query and Request Parameters
We’ll begin by setting up the essentials for sending a request to DoorDash’s category-level product API.
This is similar to what we did before, but this time the query targets the retailStoreCategoryFeed operation under the categorySearch endpoint.
We’re still using cursor-based pagination.
However, there are two additional values we’ll need to extract from a store’s URL.
When I select somewhere in Brooklyn as my address, I can view, for example, CVS, a US convenience chain. And when I click on the Drinks category, my URL becomes this:
https://www.doordash.com/convenience/store/1235954/category/drinks-751
I can extract what I need from here: storeId is 1235954 and categoryId is drinks-751.
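If you’d rather not copy those by hand, here’s a quick way to pull both values out of any category URL with that /convenience/store/<storeId>/category/<categoryId> structure:

from urllib.parse import urlparse

category_url = "https://www.doordash.com/convenience/store/1235954/category/drinks-751"
parts = urlparse(category_url).path.strip("/").split("/")
store_id = parts[parts.index("store") + 1]        # "1235954"
category_id = parts[parts.index("category") + 1]  # "drinks-751"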
Import the request libraries and paste in the store and category values alongside your token and session ID:
import requests
import json
import csv
TOKEN = "<your-token>"
TARGET_URL = "https://www.doordash.com/graphql/categorySearch?operation=categorySearch"
SESSION_ID = "<session-id>"
STORE_ID = "1235954"
CATEGORY_ID = "drinks-751"
API_URL = f"http://api.scrape.do/?token={TOKEN}&super=true&url={TARGET_URL}&sessionId={SESSION_ID}"
Then, closely observing the Network tab of the Developer Tools, find the categorySearch request and copy the query part of the payload.
Add the query string to your script, which will define what fields we want in return and which parameters we’ll be sending.
This one is slightly cleaner than the homepage feed, and we’re mostly interested in legoRetailItems, which contains the products:
QUERY = """query categorySearch($storeId: ID!, $categoryId: ID!, $subCategoryId: ID, $limit: Int, $cursor: String, $filterKeysList: [String!], $sortBysList: [RetailSortByOption!]!, $filterQuery: String, $aggregateStoreIds: [String!]) {
retailStoreCategoryFeed(
storeId: $storeId
l1CategoryId: $categoryId
l2CategoryId: $subCategoryId
limit: $limit
cursor: $cursor
filterKeysList: $filterKeysList
sortBysList: $sortBysList
filterQuery: $filterQuery
aggregateStoreIds: $aggregateStoreIds
) {
legoRetailItems { custom }
pageInfo { cursor hasNextPage }
}
}"""
This tells DoorDash to return all retail items in the given category, along with pagination info so we can fetch the next batch later.
We’ll build the loop and parsing logic in the next step.
Looping Through Paginated Product Batches in a Category
Just like restaurant listings, DoorDash product listings are also paginated, especially for chains with hundreds or even thousands of items in a single category.
We’ll use a cursor-based loop to go through each batch of items.
The cursor value gets updated on each response so we can request the next page. We start with an empty cursor and continue as long as hasNextPage remains true.
Let’s initialize our main logic and loop:
def main():
cursor = ""
page_num = 1
products = []
while True:
payload = {
"query": QUERY,
"variables": {
"storeId": STORE_ID,
"categoryId": CATEGORY_ID,
"sortBysList": ["UNSPECIFIED"],
"cursor": cursor,
"limit": 500,
"filterQuery": "",
"filterKeysList": [],
"aggregateStoreIds": []
}
}
print(f"Requesting page {page_num}...")
try:
response = requests.post(API_URL, data=json.dumps(payload))
response.raise_for_status()
data = response.json()
except Exception as e:
print(f"Request or JSON decode failed: {e}")
break
💡 In the original payload, the default limit variable is set to 50. We changed it to 500 here to scrape 10X faster with 1/10 the number of requests!
Here, we’re sending a new request on each loop with the current cursor. If DoorDash has more items, they’ll respond with a new cursor inside pageInfo.
Next, let’s grab the returned product batch and update the cursor if needed:
feed = data.get("data", {}).get("retailStoreCategoryFeed", {})
lego_items = feed.get("legoRetailItems", [])
for facet in lego_items:
prod = parse_product_fields(facet)
if prod[0]:
products.append(prod)
page_info = feed.get("pageInfo", {})
next_cursor = page_info.get("cursor")
has_next = page_info.get("hasNextPage")
if not has_next or not next_cursor or next_cursor == cursor:
break
cursor = next_cursor
page_num += 1
We’ve now walked through all available product batches for that specific category. Next, we need the parsing logic:
Extracting and Structuring Product Fields
Each product returned in the legoRetailItems list is a deeply nested dictionary filled with encoded JSON, optional fields, and redundant keys.
To make sense of it, we’ll write a helper function called parse_product_fields() that:
- Decodes the embedded JSON in the custom field
- Extracts key metadata like name, price, ratings, image, and description
- Returns a list of clean, human-readable values
Let’s start by safely loading the JSON string:
def parse_product_fields(facet):
try:
custom = json.loads(facet.get('custom', '{}'))
except Exception:
custom = {}
From this decoded custom object, we pull a few key dictionaries:
item_data = custom.get('item_data', {})
price_name_info = custom.get('price_name_info', {}).get('default', {}).get('base', {})
logging_info = custom.get('logging', {})
image_url = custom.get('image', {}).get('remote', {}).get('uri', '')
These contain all the fields DoorDash uses to display products in its retail interface.
Now let’s extract and prioritize each value. We always prefer item_data first; if it’s missing, we fall back to price_name_info:
name = item_data.get('item_name') or price_name_info.get('name')
price = item_data.get('price', {}).get('display_string') or price_name_info.get('price', {}).get('default', {}).get('price')
reviews_count = logging_info.get('item_num_of_reviews') or price_name_info.get('ratings', {}).get('count_of_reviews')
reviews_avg = logging_info.get('item_star_rating') or price_name_info.get('ratings', {}).get('average')
stock = item_data.get('stock_level') or logging_info.get('product_badges')
image = image_url
description = logging_info.get('description')
Finally, we return these values as a list, which our loop appends to the main products list:
return [name, price, reviews_count, reviews_avg, stock, image, description]
With this function in place, each item in our CSV will be consistent, structured, and ready for downstream processing or analysis. Now let’s write the final part: exporting this data into a clean CSV file.
Exporting Products to CSV
Once we’ve looped through every product in the category and parsed its details, we’ll save everything into a CSV file just like we did before.
So here’s the complete code with the same CSV logic added:
import requests
import json
import csv
# Scrape.do API token and DoorDash GraphQL endpoint
TOKEN = "<your-token>"
TARGET_URL = "https://www.doordash.com/graphql/categorySearch?operation=categorySearch"
SESSION_ID = "<session-id>"
STORE_ID = "1235954"
CATEGORY_ID = "drinks-751"
# Build the Scrape.do API URL
API_URL = f"http://api.scrape.do/?token={TOKEN}&super=true&url={TARGET_URL}&sessionId={SESSION_ID}"
# GraphQL query for fetching category products
QUERY = """query categorySearch($storeId: ID!, $categoryId: ID!, $subCategoryId: ID, $limit: Int, $cursor: String, $filterKeysList: [String!], $sortBysList: [RetailSortByOption!]!, $filterQuery: String, $aggregateStoreIds: [String!]) { retailStoreCategoryFeed(storeId: $storeId l1CategoryId: $categoryId l2CategoryId: $subCategoryId limit: $limit cursor: $cursor filterKeysList: $filterKeysList sortBysList: $sortBysList filterQuery: $filterQuery aggregateStoreIds: $aggregateStoreIds) { legoRetailItems { custom } pageInfo { cursor hasNextPage } } }"""
# Helper to parse product fields from the GraphQL response
# Extracts name, price, reviews, stock, image, and description
def parse_product_fields(facet):
try:
custom = json.loads(facet.get('custom', '{}'))
except Exception:
custom = {}
item_data = custom.get('item_data', {})
price_name_info = custom.get('price_name_info', {}).get('default', {}).get('base', {})
logging_info = custom.get('logging', {})
image_url = custom.get('image', {}).get('remote', {}).get('uri', '')
name = item_data.get('item_name') or price_name_info.get('name')
price = item_data.get('price', {}).get('display_string') or price_name_info.get('price', {}).get('default', {}).get('price')
reviews_count = logging_info.get('item_num_of_reviews') or price_name_info.get('ratings', {}).get('count_of_reviews')
reviews_avg = logging_info.get('item_star_rating') or price_name_info.get('ratings', {}).get('average')
stock = item_data.get('stock_level') or logging_info.get('product_badges')
image = image_url
description = logging_info.get('description')
return [name, price, reviews_count, reviews_avg, stock, image, description]
# Main scraping logic
# Paginates through all category products and writes them to CSV
def main():
cursor = ""
page_num = 1
products = []
while True:
# Build the GraphQL payload for the current page
payload = {
"query": QUERY,
"variables": {
"storeId": STORE_ID,
"categoryId": CATEGORY_ID,
"sortBysList": ["UNSPECIFIED"],
"cursor": cursor,
"limit": 500,
"filterQuery": "",
"filterKeysList": [],
"aggregateStoreIds": []
}
}
print(f"Requesting page {page_num}...")
try:
response = requests.post(API_URL, data=json.dumps(payload))
response.raise_for_status()
data = response.json()
except Exception as e:
print(f"Request or JSON decode failed: {e}")
break
# Parse the product feed and extract product data
feed = data.get("data", {}).get("retailStoreCategoryFeed", {})
lego_items = feed.get("legoRetailItems", [])
for facet in lego_items:
prod = parse_product_fields(facet)
if prod[0]:
products.append(prod)
# Check for next page
page_info = feed.get("pageInfo", {})
next_cursor = page_info.get("cursor")
has_next = page_info.get("hasNextPage")
if not has_next or not next_cursor or next_cursor == cursor:
break
cursor = next_cursor
page_num += 1
# Write all products to CSV file
with open("doordash_category_products.csv", 'w', encoding='utf-8', newline='') as f:
writer = csv.writer(f)
writer.writerow(['name', 'price', 'reviews_count', 'reviews_avg', 'stock', 'image_url', 'description'])
writer.writerows(products)
print(f"Extracted {len(products)} products to doordash_category_products.csv")
if __name__ == "__main__":
main()
And here’s what the output looks like in the CSV:
To Sum Up
DoorDash has one of the most complex and dynamic frontends of any delivery platform, but as you’ve seen, that doesn’t mean it’s unscrapable.
By mimicking how DoorDash sets sessions, sends GraphQL mutations, and loads product listings in batches, you can extract local restaurant and store listings, entire menus from non-chain stores, and thousands of retail products by category from chain stores.
All with simple Python scripts.
And with Scrape.do, you don’t need to worry about:
- TLS fingerprinting, session spoofing, or rotating premium proxies
- Manual CAPTCHA solving or headless browser tricks
- Fighting Cloudflare updates that break your scraper every week
Just send the request and Scrape.do takes care of the rest.