Commit

docs: add upgrading guide to v03 (#464)
### Description

- Add an upgrading guide from v0.2.x to v0.3.0.
- There were 2 breaking changes:
  - declaration of public & private interface,
  - request queue v2.

### Issues

- N/A

### Testing

- Website was rendered locally.

### Checklist

- [x] CI passed

---------

Co-authored-by: Jan Buchar <[email protected]>
vdusek and janbuchar authored Aug 27, 2024
1 parent e35f25c commit 1c8d3f1
Showing 2 changed files with 54 additions and 16 deletions.
38 changes: 38 additions & 0 deletions docs/upgrading/upgrading_to_v03.md
@@ -0,0 +1,38 @@
---
id: upgrading-to-v03
title: Upgrading to v0.3
---

This page summarizes most of the breaking changes between Crawlee for Python v0.2.x and v0.3.0.

## Public and private interface declaration

In previous versions, most of the package was public, including many elements intended for internal use only. With v0.3, we have clearly separated the public and private interfaces of the package. As a result, some import paths have changed (see below). If you import something that is now designated as private, we recommend reconsidering its use or describing your use case to us in the project's discussions or issues.

Here is a list of the updated public imports:

```diff
- from crawlee.enqueue_strategy import EnqueueStrategy
+ from crawlee import EnqueueStrategy
```

```diff
- from crawlee.models import Request
+ from crawlee import Request
```

```diff
- from crawlee.basic_crawler import Router
+ from crawlee.router import Router
```
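
If a codebase has to run against both v0.2.x and v0.3 during a migration, one option is a small fallback helper that tries the new import path first and the old one second. The sketch below is only an illustration: `import_first` is a hypothetical helper, not part of Crawlee, and the module paths in the commented usage are the ones listed above.

```python
import importlib
from typing import Any, Sequence


def import_first(name: str, module_paths: Sequence[str]) -> Any:
    """Return attribute `name` from the first module in `module_paths` that provides it."""
    for path in module_paths:
        try:
            module = importlib.import_module(path)
        except ImportError:
            # This module path does not exist in the installed version; try the next one.
            continue
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(f'{name!r} not found in any of: {", ".join(module_paths)}')


# Example usage (requires Crawlee to be installed); v0.3 path first, v0.2.x path second:
# EnqueueStrategy = import_first('EnqueueStrategy', ['crawlee', 'crawlee.enqueue_strategy'])
# Request = import_first('Request', ['crawlee', 'crawlee.models'])
# Router = import_first('Router', ['crawlee.router', 'crawlee.basic_crawler'])
```

Once the migration is complete, plain `from crawlee import Request` style imports are simpler and friendlier to type checkers, so a helper like this is best treated as temporary.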

## Request queue

There were internal changes that should not affect the intended usage:

- The unused `BaseRequestQueueClient.list_requests()` method was removed
- `RequestQueue` internals were updated to match the "Request Queue V2" implementation in Crawlee for JS

## Service container

A new module, `crawlee.service_container`, was added to manage "global instances". It currently holds `Configuration`, `EventManager` and `BaseStorageClient`. The module also replaces the `StorageClientManager` static class. Its interface is likely to change in the future. If your use case requires working with it, please get in touch; we'll be glad to hear any feedback.
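
To make the idea of a module that manages "global instances" more concrete, here is a minimal, generic sketch of the pattern. It is not the actual `crawlee.service_container` interface: the functions `register_factory`, `set_service` and `get_service` are hypothetical names chosen for illustration only.

```python
from typing import Any, Callable

# Hypothetical registry of shared instances; an illustration of the pattern,
# not the actual `crawlee.service_container` API.
_factories: dict[str, Callable[[], Any]] = {}
_instances: dict[str, Any] = {}


def register_factory(name: str, factory: Callable[[], Any]) -> None:
    """Register how to build the default instance for `name`."""
    _factories[name] = factory


def set_service(name: str, instance: Any) -> None:
    """Install an explicit instance; refuse to swap one that is already in use."""
    existing = _instances.get(name)
    if existing is not None and existing is not instance:
        raise RuntimeError(f'Service {name!r} is already set')
    _instances[name] = instance


def get_service(name: str) -> Any:
    """Return the shared instance for `name`, creating it lazily on first access."""
    if name not in _instances:
        _instances[name] = _factories[name]()
    return _instances[name]
```

Keeping the registry at module level gives every caller the same instance without passing it around explicitly, which is roughly the role a static manager class would otherwise play.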
32 changes: 16 additions & 16 deletions website/sidebars.js
@@ -103,22 +103,22 @@ module.exports = {
 // },
 // ],
 // },
-// {
-// type: 'category',
-// label: 'Upgrading',
-// link: {
-// type: 'generated-index',
-// title: 'Upgrading',
-// slug: '/upgrading',
-// keywords: ['upgrading'],
-// },
-// items: [
-// {
-// type: 'autogenerated',
-// dirName: 'upgrading',
-// },
-// ],
-// },
+{
+type: 'category',
+label: 'Upgrading',
+link: {
+type: 'generated-index',
+title: 'Upgrading',
+slug: '/upgrading',
+keywords: ['upgrading'],
+},
+items: [
+{
+type: 'autogenerated',
+dirName: 'upgrading',
+},
+],
+},
 {
 type: 'doc',
 label: 'Changelog',
