Releases: apify/crawlee

v3.11.5

04 Oct 12:36

3.11.5 (2024-10-04)

Bug Fixes

  • core: fix forefront request fetching in RQv2 (#2689) (03951bd), closes #2669
  • core: respect forefront option in prolong- and deleteRequestLock (#2690) (cba8da3), closes #2681 #2689 #2669
  • core: check .isFinished() before RequestList reads (#2695) (6fa170f)
  • core: accept Uint8Array in KVS.setValue() (#2682) (8ef0e60)
  • core: trigger errorHandler for session errors (#2683) (7d72bcb), closes #2678
  • core: decode special characters in proxy username and password (#2696) (0f0fcc5)
  • http-crawler: avoid crashing when gotOptions.cache is on (#2686) (1106d3a)
  • puppeteer: rename ignoreHTTPSErrors to acceptInsecureCerts to support v23 (#2684) (f3927e6)
  • memory-storage: respect forefront option in RequestQueue (#2681) (b0527f9), closes #2669
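
To illustrate the proxy credentials fix listed above (#2696): a proxy URL may carry percent-encoded special characters in its username and password, and these need decoding before the credentials are used. The following is a minimal sketch of that idea using only the standard `URL` class; the helper name `decodeProxyCredentials` and the example proxy URL are assumptions for illustration, not Crawlee's actual implementation.

```typescript
// Hypothetical helper for illustration only; not Crawlee's internal code.
interface ProxyCredentials {
    username: string;
    password: string;
}

function decodeProxyCredentials(proxyUrl: string): ProxyCredentials {
    // The WHATWG URL parser keeps the userinfo part percent-encoded,
    // so the `username` and `password` getters return e.g. "p%40ss".
    const { username, password } = new URL(proxyUrl);
    return {
        username: decodeURIComponent(username),
        password: decodeURIComponent(password),
    };
}

// "@" is encoded as %40 and ":" as %3A in the example URL (made-up host).
const creds = decodeProxyCredentials(
    'http://user%40corp:p%40ss%3Aword@proxy.example.com:8000',
);
console.log(creds); // { username: 'user@corp', password: 'p@ss:word' }
```

Without the decoding step, the still-encoded strings would be sent to the proxy as credentials, which would typically fail authentication.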

v3.11.4

23 Sep 08:15

3.11.4 (2024-09-23)

Bug Fixes

  • SitemapRequestList.teardown() doesn't break persistState calls (#2673) (fb2c5cd), closes #2672

v3.11.3

03 Sep 15:12

3.11.3 (2024-09-03)

Bug Fixes

  • improve FACEBOOK_REGEX to match older style page URLs (#2650) (a005e69), closes #2216
  • RequestQueueV2: reset recently handled cache too if the queue is pending for too long (#2656) (51a69bc)

v3.11.2

28 Aug 12:15

3.11.2 (2024-08-28)

Bug Fixes

  • RequestQueueV2: remove inProgress cache, rely solely on locked states (#2601) (57fcb08)
  • use namespace imports for cheerio to be compatible with v1 (#2641) (f48296f)
  • Use the correct mutex in memory storage RequestQueueClient (#2623) (2fa8a29)

Features


This release pins the cheerio dependency to the last RC version. We might postpone official support for v1 to the next major release, or at least wait for them to fix their stuff. Nice demonstration of how not to maintain popular open source projects 😞

v3.11.1

24 Jul 11:06

3.11.1 (2024-07-24)

Bug Fixes

v3.11.0

09 Jul 13:30

3.11.0 (2024-07-09)

Features

  • add iframe expansion to parseWithCheerio in browsers (#2542) (328d085), closes #2507
  • add ignoreIframes opt-out from the Cheerio iframe expansion (#2562) (474a8dc)
  • Sitemap-based request list implementation (#2498) (7bf8f0b)

v3.10.5

12 Jun 08:42

3.10.5 (2024-06-12)

Bug Fixes

  • allow creating new adaptive crawler instance without any parameters (9b7f595)
  • declare missing peer dependencies in @crawlee/browser package (#2532) (3357c7f)
  • fix detection of HTTP site when using the useState in adaptive crawler (#2530) (7e195c1)
  • mark context.request.loadedUrl and id as required inside the request handler (#2531) (2b54660)

v3.10.4

11 Jun 15:06

3.10.4 (2024-06-11)

Bug Fixes

  • add waitForAllRequestsToBeAdded option to enqueueLinks helper (925546b), closes #2318
  • add missing useState implementation into crawling context (eec4a71)
  • make crawler.log publicly accessible (#2526) (3e9e665)
  • playwright: allow passing new context options in launchOptions on type level (0519d40), closes #1849
  • respect crawler.log when creating child logger for Statistics (0a0d75d), closes #2412

v3.10.3

07 Jun 11:53

3.10.3 (2024-06-07)

Bug Fixes

  • adaptive-crawler: log only once for the committed request handler execution (#2524) (533bd3f)
  • increase timeout for retiring inactive browsers (#2523) (195f176)
  • respect implicit router when no requestHandler is provided in AdaptiveCrawler (#2518) (31083aa)
  • revert the scaling steps back to 5% (5bf32f8)
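
As a rough illustration of what a 5% scaling step means in practice, the sketch below adjusts a desired concurrency value by 5% of its current value per step, rounded up and clamped to a range. It assumes the entry above refers to the concurrency autoscaling step; the function names, the rounding, and the min/max bounds are illustrative assumptions, not taken from the release notes or from Crawlee's source.

```typescript
// Hypothetical sketch of a proportional scaling step; names are illustrative.
const SCALE_STEP_RATIO = 0.05; // the 5% step mentioned in the changelog entry

function scaleUp(desired: number, max: number, ratio: number = SCALE_STEP_RATIO): number {
    // Round the step up so that small concurrency values still move by at least 1.
    const step = Math.ceil(desired * ratio);
    return Math.min(max, desired + step);
}

function scaleDown(desired: number, min: number, ratio: number = SCALE_STEP_RATIO): number {
    const step = Math.ceil(desired * ratio);
    return Math.max(min, desired - step);
}

console.log(scaleUp(10, 200));   // 11  (step = ceil(0.5) = 1)
console.log(scaleUp(199, 200));  // 200 (clamped to the maximum)
console.log(scaleDown(10, 1));   // 9
```

A smaller ratio makes concurrency change more gradually, which trades slower ramp-up for a lower risk of overshooting available resources.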

Features

  • add waitForSelector context helper + parseWithCheerio in adaptive crawler (#2522) (6f88e73)
  • log desired concurrency in the default status message (9f0b796)

v3.10.2

03 Jun 09:07

3.10.2 (2024-06-03)

Bug Fixes

Features