Format Scrapy Export

Here is how you can choose the format in which Scrapy exports your items.

For all of these options, make sure that your spider actually yields some items in its parse method.

Using a known filename extension

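Pass an output filename with one of the known file extensions to the -O option of the crawl command. Scrapy infers the export format from the extension. Note that -O overwrites an existing file, while -o appends to it.

The known extensions are json, jsonlines, jl, csv, xml, marshal and pickle.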
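On the command line this looks something like scrapy crawl myspider -O items.json. To make the idea of "format from extension" concrete, here is a rough sketch that maps a filename to one of the formats listed above. The helper name format_from_filename is hypothetical, and this is an illustration of the idea, not Scrapy's own implementation.

```python
from pathlib import Path

# Formats named in this post, copied from the list above.
KNOWN_FORMATS = {"json", "jsonlines", "jl", "csv", "xml", "marshal", "pickle"}


def format_from_filename(filename):
    """Return the export format implied by the file extension, or None."""
    extension = Path(filename).suffix.lstrip(".")
    return extension if extension in KNOWN_FORMATS else None


print(format_from_filename("items.csv"))
print(format_from_filename("export.jsonlines"))
print(format_from_filename("items.txt"))
```

A filename with an unknown extension, such as items.txt, gives no usable format here, which is the case where you need to state the format explicitly.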


Using the -t option

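Provide the format in the -t option for the crawl command. This is useful when the output filename does not have one of the known extensions.

Accepted formats are the same (json, jsonlines, jl, csv, xml, marshal and pickle). Note that recent Scrapy versions deprecate -t in favor of appending the format to the output URI, as in -o items.data:csv.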

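Using the FEEDS setting of the spider

Provide a value for the FEEDS setting in your spider's custom_settings. Each key is an output file URI, and each value is a dict of options for that feed, including its format.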

More info

More details can be found in the official Scrapy docs on feed exports.

import scrapy

class MySpider(scrapy.Spider):
    name = "myspider"
    custom_settings = {
        'FEEDS': {
            'export.jsonlines': {       # file URI (name)
                'format': 'jsonlines',  # json/jsonlines/csv/xml/marshal/pickle
            }
        }
    }
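FEEDS can also describe several outputs at once and carry per-feed options. The following is a minimal sketch of such a value as it might appear in a project's settings.py instead of custom_settings. The file names and the field names title and url are assumptions chosen for illustration; format, encoding, indent, overwrite and fields are feed options described in the Scrapy docs.

```python
# settings.py (project-wide alternative to a spider's custom_settings)
FEEDS = {
    "items.json": {
        "format": "json",
        "encoding": "utf8",
        "indent": 4,          # pretty-print the JSON output
        "overwrite": True,    # replace the file on each run
    },
    "items.csv": {
        "format": "csv",
        "fields": ["title", "url"],  # hypothetical item fields to export
    },
}
```

With a value like this, a single crawl writes the same items to both files, each in its own format.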
