
It is a major change to ask for, but I am wondering whether they have considered switching to asyncio instead of Twisted. Twisted is a great library, but it is a huge dependency to maintain.



We're seriously considering this option (maybe for Scrapy 2.0?), but there are no concrete plans yet. It can't be just asyncio - we'll need e.g. aiohttp and other packages.

Some links:

* POC for aiohttp as an HTTP handler: https://github.com/scrapy/scrapy/pull/1455

* some thoughts on how to build an async/await API for Scrapy; it is not all roses: https://github.com/scrapy/scrapy/issues/1144#issuecomment-14...
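
To illustrate why asyncio alone is not enough: the standard library provides async TCP streams but no async HTTP client, so request framing and response parsing would have to be written by hand, which is the gap a package like aiohttp fills. The following is only a rough sketch under stated assumptions, not Scrapy's or aiohttp's code. `handle`, `raw_get`, `demo` and `run_demo` are hypothetical names, and the tiny local server exists only so the example runs without network access.

```python
import asyncio


async def handle(reader, writer):
    # Hypothetical minimal server: read the request head, send a fixed reply.
    await reader.readuntil(b"\r\n\r\n")
    body = b"hello"
    head = b"HTTP/1.1 200 OK\r\nContent-Length: %d\r\nConnection: close\r\n\r\n" % len(body)
    writer.write(head + body)
    await writer.drain()
    writer.close()
    await writer.wait_closed()


async def raw_get(host, port, path="/"):
    # With only asyncio, the HTTP request has to be assembled by hand.
    reader, writer = await asyncio.open_connection(host, port)
    request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
    writer.write(request.encode("ascii"))
    await writer.drain()
    raw = await reader.read()  # read until the server closes the connection
    writer.close()
    await writer.wait_closed()
    # Very naive parsing: no redirects, chunked encoding, TLS, cookies, etc.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n", 1)[0].decode("ascii")
    return int(status_line.split()[1]), body


async def demo():
    # Port 0 lets the OS pick a free port for the throwaway server.
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        return await raw_get("127.0.0.1", port)


def run_demo():
    return asyncio.run(demo())
```

Everything this sketch leaves out (connection pooling, TLS, redirects, compression, proxies) is what a dedicated HTTP package would have to supply.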


There have been some discussions and a proof of concept with asyncio + aiohttp: https://github.com/scrapy/scrapy/pull/1455

However, as you said, it'd be a major change and it would affect the whole ecosystem (plugins and extensions), so it's complicated. We'll see what happens. :)


Can you elaborate? What makes it a "huge dependency to maintain"? Is there anything that the Twisted project can do to make it easier? If this is actually a problem I'd really like to hear from users on the Twisted mailing list and bug tracker.


Twisted is a general-purpose library/framework with lots of features. That is the "huge" part. I have used it a lot in previous projects and appreciated it.

What I was trying to say is that if Scrapy uses only a small part of the library, it may be possible for the developers to use similar constructs from Python's standard library. In any case, a dependency is a dependency, and it is always better to minimize the code footprint.
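
As a rough illustration of "similar constructs from the standard library", here is a minimal sketch of bounded concurrent fetching with nothing but asyncio. It does not reflect how Scrapy's downloader actually works. `fake_fetch`, `crawl` and `run_crawl` are hypothetical names, and the fetch is simulated with a sleep because the standard library has no async HTTP client.

```python
import asyncio


async def fake_fetch(url, delay=0.01):
    # Hypothetical stand-in for a real HTTP request.
    await asyncio.sleep(delay)
    return f"<html>{url}</html>"


async def crawl(urls, max_concurrency=2):
    # The semaphore caps how many fetches are in flight at once.
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded_fetch(url):
        async with semaphore:
            return await fake_fetch(url)

    # gather() returns results in the same order as the input URLs.
    return await asyncio.gather(*(bounded_fetch(u) for u in urls))


def run_crawl(urls, max_concurrency=2):
    return asyncio.run(crawl(urls, max_concurrency))
```

For example, `run_crawl(["a", "b", "c"])` returns one simulated page per URL, in input order. Retries, timeouts and error handling are left out and would still need to be built or pulled in from elsewhere.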



