Update README
karwa committed Jan 21, 2022
1 parent 68b7daf commit 826f97c
Showing 3 changed files with 108 additions and 63 deletions.
171 changes: 108 additions & 63 deletions README.md
@@ -1,51 +1,48 @@
# WebURL
# **WebURL**

[![](https://img.shields.io/endpoint?url=https%3A%2F%2Fswiftpackageindex.com%2Fapi%2Fpackages%2Fkarwa%2Fswift-url%2Fbadge%3Ftype%3Dswift-versions)](https://swiftpackageindex.com/karwa/swift-url)
[![](https://img.shields.io/endpoint?url=https%3A%2F%2Fswiftpackageindex.com%2Fapi%2Fpackages%2Fkarwa%2Fswift-url%2Fbadge%3Ftype%3Dplatforms)](https://swiftpackageindex.com/karwa/swift-url)

A new URL type for Swift.

- **Compliant** with the [URL Living Standard](https://url.spec.whatwg.org/) for web compatibility. WebURL matches modern browsers and popular libraries in other languages.
- 🌍 **Compliant** with the [latest URL Standard](https://url.spec.whatwg.org/). WebURL matches how modern browsers interpret URLs.

- **Fast**. Tuned for high performance and low memory use.

- **Swifty**. The API makes liberal use of generics, in-place mutation, zero-cost abstractions, and other Swift features. It's a big step up from Foundation's `URL`.
- ⚡️ **Fast**. Tuned for high performance and low memory use.

- **Portable**. The core WebURL library has no dependencies other than the Swift standard library.
- 🍭 **Swifty**. The API makes liberal use of generics, in-place mutation, zero-cost abstractions, and other Swift features. It's a big step up from Foundation's `URL`.

- **Memory-safe**. WebURL uses carefully tuned bounds-checking techniques which the compiler is better able to reason about.
- 🧳 **Portable**. The core WebURL library has no dependencies other than the Swift standard library, and no platform-specific behavior.

And of course, it's written in **100% Swift**.
- 🥽 **Memory-safe**. WebURL uses carefully tuned bounds-checking techniques so it can be both fast _and_ safe.

> [**NEW**] The documentation has been entirely rewritten for DocC!
> It's hard to overstate what an enormous improvement it is - just check it out for yourself!
_(and of course, it's written in **100% Swift**)_.

The [Documentation](https://karwa.github.io/swift-url/main/documentation/weburl/) is the best place to learn about WebURL.
Note: The documentation at `/main/documentation/weburl` is built from the `main` branch, but you can also visit the stable documentation for a specific version (e.g. `/0.2.1/documentation/weburl`).
📚 Check out the [Documentation](https://karwa.github.io/swift-url/main/documentation/weburl/) to learn more 📚
<br/>
<br/>

# Using WebURL in your project

To use this package in a SwiftPM project, you need to set it up as a package dependency:

```swift
// swift-tools-version:5.3
import PackageDescription

let package = Package(
  name: "MyPackage",
  dependencies: [
    .package(
      url: "https://github.com/karwa/swift-url",
      .upToNextMajor(from: "0.2.0") // or `.upToNextMinor`
    )
  ],
  targets: [
    .target(
      name: "MyTarget",
      dependencies: [
        .product(name: "WebURL", package: "swift-url")
      ]
    )
  ]
)
// Add the package as a dependency.
dependencies: [
  .package(
    url: "https://github.com/karwa/swift-url",
    .upToNextMinor(from: "0.3.0")
  )
]

// Then add the WebURL library as a target dependency.
targets: [
  .target(
    name: "<Your target>",
    dependencies: [
      .product(name: "WebURL", package: "swift-url")
    ]
  )
]
```

And with that, you're ready to start using `WebURL`:
@@ -63,36 +60,78 @@ url.pathComponents += ["apple", "swift"]
url // "https://github.com/apple/swift"
```

Visit [the documentation](https://karwa.github.io/swift-url/main/documentation/weburl/weburl/) for an overview of what you can do with `WebURL`.
📚 Check out the [Documentation](https://karwa.github.io/swift-url/main/documentation/weburl/) to learn about WebURL's API 📚
<br/>
<br/>

## 🔗 Integration with Foundation

WebURL 0.3.0 includes a library called `WebURLFoundationExtras`, which allows you to construct a `WebURL` from a Foundation `URL` value. To use it, add it to your target dependencies and import the module.

```swift
targets: [
  .target(
    name: "<Your target>",
    dependencies: [
      .product(name: "WebURL", package: "swift-url"),
      // 👇 Add this line 👇
      .product(name: "WebURLFoundationExtras", package: "swift-url")
    ]
  )
]
```

Now you're able to accept URLs using a `Foundation.URL` value while taking advantage of WebURL's web-compatible normalization and fantastic API. Note that this can fail, because Foundation is quite loose about what it accepts as a "URL" and some ambiguous values aren't considered valid by the latest standard, but the things you expect to work will work :)

```swift
import Foundation
import WebURL
import WebURLFoundationExtras

public func processURL(_ url: Foundation.URL) throws {
  guard let webURL = WebURL(url) else {
    throw InvalidURLError()
  }
  // Continue processing using WebURL.
}
```
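
The snippet above throws `InvalidURLError`, a name the README uses without defining. A minimal sketch of what such a type could look like follows; the definition and its message are assumptions for illustration, not part of WebURL.

```swift
// Hypothetical error type: the README's example throws `InvalidURLError`
// but does not define it, so this definition is an assumption.
struct InvalidURLError: Error, CustomStringConvertible {
    var description: String {
        "The Foundation.URL value could not be converted to a WebURL"
    }
}

// Throwing and catching it works like any other Swift error.
func alwaysFails() throws {
    throw InvalidURLError()
}

do {
    try alwaysFails()
} catch {
    print("Caught: \(error)")
}
```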

WebURL -> Foundation.URL conversion will be coming in a later version.
<br/>
<br/>

## Integration with swift-system
## 🔗 Integration with swift-system

WebURL 0.2.0 includes a library called `WebURLSystemExtras`, which integrates with `swift-system` and Apple's `System.framework`. This allows you to create `file:` URLs from `FilePath`s, and to create `FilePath`s from `file:` URLs. It supports both POSIX and Windows paths.
WebURL 0.2.0 includes a library called `WebURLSystemExtras`, which integrates with `swift-system` and Apple's `System.framework` and allows you to create `file:` URLs from `FilePath`s and vice versa. It has excellent support for both POSIX and Windows paths. Again, to use it, add the target dependency and import the module.

```swift
.target(
  name: "MyTarget",
  name: "<Your target>",
  dependencies: [
    .product(name: "WebURL", package: "swift-url"),
    .product(name: "WebURLSystemExtras", package: "swift-url") // <--- Add this.
    // 👇 Add this line 👇
    .product(name: "WebURLSystemExtras", package: "swift-url")
  ]
)
```

And you're good to go!

```swift
import WebURL
import System
import WebURL
import WebURLSystemExtras

func openFile(at url: WebURL) throws -> FileDescriptor {
  let path = try FilePath(url: url)
  return try FileDescriptor.open(path, .readOnly)
}
```
<br/>

## Prototype port of async-http-client
## 🧪 async-http-client Port

We have a prototype port of [async-http-client](https://github.com/karwa/async-http-client), based on version 1.7.0 (the latest release as of writing), which uses WebURL for _all_ of its URL handling. It allows you to perform http(s) requests with WebURL, including support for HTTP/2, and is a useful demonstration of how to adopt WebURL in your library.
We have a prototype port of [async-http-client](https://github.com/karwa/async-http-client) which uses WebURL for _all_ of its internal URL handling. If you're using AHC in your server, check it out to take advantage of the latest URL standard and WebURL's improved API. By default, it takes advantage of WebURL's Foundation integration so you can make requests using either type, but it can also be built without any Foundation dependency at all - meaning smaller binaries and faster startup times. It's also a great demonstration of how to adopt WebURL in your library.

We'll be updating the port periodically, so if you wish to use it in an application we recommend making a fork and pulling in changes as you need.

@@ -112,19 +151,20 @@ func getTextFile(url: WebURL) throws -> EventLoopFuture<String?> {
let url = WebURL("https://github.com/karwa/swift-url/raw/main/README.md")!
try getTextFile(url: url).wait() // "# WebURL A new URL type for Swift..."
```
<br/>

# Project Status
# 📰 Project Status

WebURL is a complete URL library, implementing the latest version of the URL Standard (as of writing, that is the August 2021 review draft). It is tested against the [shared `web-platform-tests`](https://github.com/web-platform-tests/wpt/) used by major browsers, and passes all constructor and setter tests other than those which rely on IDNA. The library includes a comprehensive set of APIs for working with URLs: getting/setting basic components, percent-encoding/decoding, reading and writing path components, form parameters, file paths, etc. Each has its own extensive set of tests in addition to the shared web-platform-tests.
WebURL is a complete URL library, implementing the latest version of the URL Standard (as of writing, that is commit `f787850`). It currently does not support Internationalized Domain Names (IDNA), but that support is planned.

The project is regularly benchmarked using the suite available in the `Benchmarks` directory and fuzz-tested using the fuzzers available in the `Fuzzers` directory.
It is tested against the [shared `web-platform-tests`](https://github.com/web-platform-tests/wpt/) used by major browsers, and passes all constructor and setter tests (other than those which require IDNA). The library includes a comprehensive set of APIs for working with URLs: getting/setting components, percent-encoding/decoding, reading and writing path components, form parameters, file paths, etc. Each has its own extensive set of tests in addition to the shared web-platform-tests. The project is regularly benchmarked and fuzz-tested. The benchmark and fuzz-testing suites are available in the `Benchmarks` and `Fuzzers` directories respectively.

Being a pre-1.0 package, the interfaces have not had time to stabilize. If there's anything you think could be improved, your feedback is welcome - either open a GitHub issue or post to the [Swift forums](https://forums.swift.org/c/related-projects/weburl/73).

Prior to 1.0, it may be necessary to make source-breaking changes.
I'll do my best to keep these to a minimum, and any such changes will be accompanied by clear documentation explaining how to update your code.

## Roadmap
## 🗺 Roadmap

Aside from stabilizing the API, the other priorities for v1.0 are:

@@ -134,36 +174,39 @@ Aside from stabilizing the API, the other priorities for v1.0 are:

We will provide a compatibility library which allows these APIs to be used together with `WebURL`.

Looking beyond v1.0, the other features I'd like to add are:
2. More APIs for query parameters.

2. Better APIs for `data:` URLs.
A URL's `query` component is often used as a string of key-value pairs. This usage appears to have originated with HTML forms, and WebURL has excellent support for this via its `formParams` view, but popular convention is also to use keys and values that are _not strictly_ form-encoded. This can lead to decoding issues.

WebURL already supports them as generic URLs, but it would be nice to add APIs for extracting the MIME type and decoding base64-encoded data.
3. Non-form-encoded query parameters.
Additionally, we may want to consider making key lookup Unicode-aware. It makes sense, but AFAIK is unprecedented in other libraries and so may be surprising. But it does make a lot of sense. Feedback is welcome.

Looking beyond v1.0, the other features I'd like to add are:

A URL's `query` component is often used as a string of key-value pairs. This usage appears to have originated with HTML forms, which WebURL supports via its `formParams` view, but popular convention these days is also to use keys and values that are not _strictly_ form-encoded. This can lead to decoding issues.
3. Better APIs for `data:` URLs.

Additionally, we may want to consider making key lookup Unicode-aware. It makes sense, but AFAIK is unprecedented in other libraries and so may be surprising. But it does make a lot of sense.

WebURL already supports them as generic URLs, but it would be nice to add APIs for extracting the MIME type and decoding base64-encoded data.
4. APIs for relative references.

All `WebURL`s are absolute URLs (following the standard), and relative references are currently only supported as strings via the [`WebURL.resolve(_:)` method](https://karwa.github.io/swift-url/main/documentation/weburl/weburl/resolve(_:)).

It would be valuable to a lot of applications (e.g. server frameworks) to add a richer API for reading and manipulating relative references, instead of using only strings. We may also want to calculate the difference between 2 URLs and return the result as a relative reference.

5. IDNA
5. Support Internationalized Domain Names (IDNA).

This is part of the URL Standard, and its position on this list shouldn't be read as downplaying its importance. It is a high-priority item, but is currently blocked by other things.

There is reason to hope this may be implementable soon. Native Unicode normalization was [recently](https://github.com/apple/swift/pull/38922) implemented in the Swift standard library for String, and there is a desire to expose this functionality to libraries such as this one. Once those APIs are available, we'll be able to use them to implement IDNA.
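
The decoding issue mentioned in the query-parameters item can be illustrated without any library. In form-encoding, a literal `+` stands for a space, whereas plain percent-decoding leaves it alone, so the same query value can decode to two different strings. The helpers below (`hexValue`, `decode`) are hypothetical, written only against the Swift standard library, and are not WebURL's API.

```swift
/// Returns the numeric value of an ASCII hex digit, or nil for any other byte.
func hexValue(_ byte: UInt8) -> UInt8? {
    Character(UnicodeScalar(byte)).hexDigitValue.map { UInt8($0) }
}

/// Percent-decodes `input`. When `plusAsSpace` is true, a literal "+" is also
/// decoded as a space, as form-encoding does. Malformed "%" sequences are
/// passed through unchanged.
func decode(_ input: String, plusAsSpace: Bool) -> String {
    let utf8 = Array(input.utf8)
    var bytes: [UInt8] = []
    var i = 0
    while i < utf8.count {
        let byte = utf8[i]
        if byte == UInt8(ascii: "%"), i + 2 < utf8.count,
           let high = hexValue(utf8[i + 1]), let low = hexValue(utf8[i + 2]) {
            // A valid "%XX" sequence becomes a single byte.
            bytes.append((high << 4) | low)
            i += 3
            continue
        }
        let isFormSpace = plusAsSpace && byte == UInt8(ascii: "+")
        bytes.append(isFormSpace ? UInt8(ascii: " ") : byte)
        i += 1
    }
    return String(decoding: bytes, as: UTF8.self)
}

let value = "c%2B%2B+tutorial"
print(decode(value, plusAsSpace: true))   // form-decoded: "c++ tutorial"
print(decode(value, plusAsSpace: false))  // percent-decoded only: "c+++tutorial"
```

A producer that writes a literal `+` expecting it to survive, paired with a consumer that form-decodes, is one way the two conventions can silently disagree.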
<br/>
<br/>

# Sponsorship
# 💝 Sponsorship

I'm creating this library because I think that Swift is a great language, and it deserves a high-quality, modern library for handling URLs. It has taken a lot of time to get things to this stage, and there is an exciting roadmap ahead, so if you
(or the company you work for) benefit from this project, do consider donating to show your support and encourage future development. Maybe it saves you some time on your server instances, or saves you time chasing down weird bugs in your URL code.
I'm creating this library because I think that Swift is a great language, and it deserves a high-quality, modern library for handling URLs. It has taken a lot of time to get things to this stage, and there is an exciting roadmap ahead, so if you (or the company you work for) benefit from this project, do consider sponsoring it to show your support and encourage future development. Maybe it saves you some time on your server instances, or saves you time chasing down weird bugs in your URL code.
<br/>
<br/>

# FAQ
# ℹ️ FAQ

## How do I leave feedback?

@@ -191,14 +234,16 @@ Additionally, the benchmarks package available in this repository helps ensure t

It may be surprising to learn that there are many interpretations of URLs floating about - after all, you type a URL into your browser, and it just works! Right? Well, sometimes...

This [memo](https://tools.ietf.org/html/draft-ruby-url-problem-01) from the IETF network working group has a good overview of the history. In summary, URLs were first specified in 1994, and there were a lot of hopeful concepts like URIs, URNs, and scheme-specific syntax definitions. Most of those efforts didn't get the attention they would have needed and were revised by later standards such as [RFC-2396](https://datatracker.ietf.org/doc/html/rfc2396) in 1998, and [RFC-3986](https://www.ietf.org/rfc/rfc3986.txt) in 2005. Also, URLs were originally defined as ASCII, and there were fears that Unicode would break legacy systems, hence yet more standards and concepts such as IRIs, which also ended up not getting the attention they would have needed. So there are all these different standards floating around.
URLs were first specified in 1994, and were repeatedly revised over the years, such as by [RFC-2396](https://datatracker.ietf.org/doc/html/rfc2396) in 1998, and [RFC-3986](https://www.ietf.org/rfc/rfc3986.txt) in 2005. So there are all these different standards floating around - and as it turns out, they're **not always compatible** with each other, and are sometimes ambiguous.

While all this was going on, browsers were doing their own thing, and each behaved differently to the others. The web in the 90s was a real wild west, and standards-compliance wasn't a high priority. Now, that behavior has to be maintained for compatibility, but having all these different standards can lead to severe misunderstandings and even exploitable security vulnerabilities. Consider these examples from [Orange Tsai's famous talk](https://www.youtube.com/watch?v=voTHFdL9S2k) showing how different URL parsers (sometimes even within the same application) each think these URLs point to a different server.

In the meantime, browsers had been doing their own thing. The RFCs are not only ambiguous in places, but would _break the web_ if browsers adopted them. For URL libraries (e.g. cURL) and their users, web compatibility is really important, so over time they also began to diverge from the standards. These days it's rare to find any application/library which strictly follows any published standard -- and that's pretty bad! When you type your URL into a browser or use one in your application, you expect that everybody involved understands it the same way. Because when they don't, stuff doesn't work and it may even open up [exploitable bugs](https://www.youtube.com/watch?v=voTHFdL9S2k).
![](abusing-url-parsers-example-orange-tsai.png) ![](abusing-url-parsers-example-orange-tsai-2.png)

So we're at a state where there are multiple, incompatible standards. Clearly, there was only one answer: another standard! 😅 But seriously, this time, it had to be web-compatible, so browsers could adopt it. For a URL standard, matching how browsers behave is kinda a big deal, you know?
_Images are Copyright Orange Tsai_

This is where the WHATWG comes into it. The WHATWG is an industry association led by the major browser developers (currently, the steering committee consists of representatives from Apple, Google, Mozilla, and Microsoft), and there is high-level approval for their browsers to align with the standards developed by the group.
So having all these incompatible standards is a problem. Clearly, there was only one answer: yet another standard! 😅 But seriously, this time, it had to have browsers adopt it. For a URL standard, matching how browsers behave is kinda a big deal, you know? And they're not going to break the web, so it needs to document what it means to be "web compatible". It turns out, most URL libraries already include ad-hoc collections of hacks to try to guess what web compatibility means.

The WHATWG URL Living Standard defines how actors on the web platform should understand and manipulate URLs - how browsers process them, how code such as JavaScript processes them, etc.
This is where the WHATWG comes into it. The WHATWG is an industry association led by the major browser developers (currently, the steering committee consists of representatives from Apple, Google, Mozilla, and Microsoft), and there is high-level approval for their browsers to align with the standards developed by the group. The latest WebKit (Safari 15) is already in compliance. The WHATWG URL Living Standard defines how **actors on the web platform** should understand and manipulate URLs - how browsers process them, how code such as JavaScript's `URL` class interprets them, etc. And this applies at all levels, from URLs in HTML documents to HTTP redirect requests. This is the web's URL standard.

By aligning to the URL Living Standard, this project aims to provide the behavior you expect, with better reliability and interoperability, sharing a standard and test-suite with your browser, and engaging with the web standards process. And by doing so, we hope to make Swift an even more attractive language for both servers and client applications.
Binary file added abusing-url-parsers-example-orange-tsai-2.png
Binary file added abusing-url-parsers-example-orange-tsai.png
