FIP: Location in user profiles #196
Replies: 16 comments 45 replies
-
Is this a proposal to add a new |
-
Personally I like the idea of Open Location Code as user data type |
-
thanks for the proposal. this will unlock some new consumer use cases. geohash does have an advantage with better support in elastic and postgres (through ST_GeoHash). question: is the location going to be verified by clients?
-
Users must be able to delete any location data anytime they desire; otherwise it's not privacy-friendly and not GDPR compliant. Warpcast already takes location without consent, and it leaks it openly. Example link: https://warpcast.com/~/locations/ChIJT608vzr5sUARKKacfOMyBqw It doesn't matter if it is not pinpointed exactly. STATE-CITY is still regarded as PI under EU law.
-
Geospatial hash seems right to me. Here's my thinking:
-
IMO the best type is a variable type, because if any client introduces the data, the data can't be verified anyway and can be spoofed, so data will always be truthy but not truthful. With a variable type, a user can have any kind of defined location data type. From a privacy standpoint, the only requirements are a delete mechanism at the protocol level that doesn't leave traces (like delete messages that can be found) and that the feature be opt-in, not mandatory. Personally, I don't see many applications for location data, especially because with multiple clients it is impossible to trust location, but I guess for some people there might be some limited applications even if the data can't be trusted.
-
+1 for Geospatial hash
for any sort of distance based app (events, marketplace, recommendations, social) the approximate geographic distance to the +- 1km seems pretty important, to the degree that a city name is not good enough. Additionally as a developer having to reverse geocode city names/addresses to coordinates to calculate this is a real pain (and slow). OLC seems significantly simpler than H3. H3 seems fairly complex and optimized for running geo-algorithms like shortest path + optimization problems (but you can always convert from OLC -> H3 to do these if needed). Another note: as a dev, you'll probably need to compute a human readable label for each geospatial hash to display in the UI. For very precise hashes you can use the neighborhood or city, for less precise ones you may have multiple cities to choose from, in which case you perhaps choose the county or the region - but then the data has more accurate data on the user's location than is displayed to the user, which may be not expected by the user. The supported degrees of precision may be useful to restrict, or alternatively support a human readable label that is stored alongside the geospatial hash for the convenience of developers, as well as the convenience for the user to be able to override it if imprecise. This would also guarantee the user has a consistent human readable location label across apps |
-
We have done some work with Geohashes and Crypto-spatial coordinates, check it out |
-
the main benefit of geohashing seems to be that geohashes are good for indexing, since locations that are close together will have similar hash values. this isn't really a big consideration for the protocol, and even warpcast, which uses location data today, doesn't use this feature. the main downside is that it's a little more complicated than lat/long: there are multiple standards, and there may not be great libraries or support for each standard in every framework. so i'm tempted to lean lat/long for its simplicity and because you can convert a lat/long into any kind of geohash if you want to use it for your own indexing purposes.
-
This looks great! I wrote about this a little here: I think just adding location to User Data is a good start - hopefully we can eventually see it extended to casts and even as parent URLs. Tactically, one thing I think we should think about is the degree of precision we're allowing. Like you said, with one decimal we get ~11km of precision. While I think that degree of precision is useful for some apps ("X user is in your city!"), I think this limits a lot of use cases. For reference, the entire city of San Francisco is about 11km wide (from ocean to bay), so any client wanting to display regional hotspots in the city would not be able to due to the lack of granularity. Based off this chart: I think the protocol should allow for up to 3-4 decimal places of precision, and it should be up to the client to decide what degree gets published. I think most clients will go with the ~11km you suggest, but if there was a special feature a client was trying to produce I don't think they should be limited by the protocol.
-
i really like the idea of permissionless user data location. it's a good first pass at more user-centric interop and less transaction or content-centric interop. i would like to see place id at some point for user location "status" in the near term, but we don't have a permissionless model over claiming a place. i also think that privacy is fundamentally important. self-governing down to 1 decimal point balances public data and privacy preferences. 👍 |
-
Regarding the message format: I'm a bit skeptical about asking hubs to do regex matching to decide if a message is valid or not. If I was building an alt hub in Go or Rust or C, this would probably mean importing a library I did not need before. Maybe it's not a big deal right now, but do we want to add such a dependency deep in the protocol? Also, regex matching is expensive. How could this affect hubs (or even sequencers) at 100x scale? Is it worth the cost? I would prefer if the regular expression was a soft requirement, imposed by clients: "Sure, you can put anything there that's X bytes long, but 99% of the clients will ignore it if it does not match this regex".
-
Quick update: |
-
This is a very interesting use case, where location data in the user profile can be useful. https://warpcast.com/j-valeska/0xdcf5d3cf
-
FYI the title is incorrect |
-
Title: Location in user profiles
Type: Implementation FIP
Authors: @aditiharini, @sanjayprabhu, @christopher, @vrypan, @varunsrin
Abstract
Add a new User Data type that lets users share location data publicly.
Problem
If users opt in to share location, clients can build features that:
Warpcast has implemented some location features at a client level, showing demand for this. We're proposing a standard to publish this data to the user's profile.
Specification
Our requirements for the location data type are:
Encoding
Location values are encoded as latitude/longitude pairs rounded down to two decimal places. For example, the value for the city of Los Angeles would be 34.05, -118.24. A 2dp rounded lat/long provides an approximate precision of 1.1 kilometers (one hundredth of a degree of latitude) around the point, which prevents someone from publishing sensitive location data by accident. These values can easily be converted to and from other representations like geohashes or Google's place ID for use in applications.
Message Format
The UserDataType is extended to support an eighth value for USER_DATA_LOCATION. All User Data messages with this type must also pass the following validation rules:
geo:-?[0-90]\.[0-9]{2},-?[0-180]\.[0-9]{2}
which is compliant with the geo URI standard. Latitude and longitude must be specified with exactly 2 decimal place precision.
Example:
geo:-34.56,123.45
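A minimal sketch of how a client or hub might apply this rule, in Python. Read literally, `[0-90]` and `[0-180]` are character classes that match a single digit, so this sketch assumes the intent is a numeric range of ±90 for latitude and ±180 for longitude and checks those ranges separately. The function name `parse_location` is hypothetical and not part of the proposal.

```python
import re
from typing import Optional, Tuple

# Shape check only: "geo:", optional minus sign, integer part, exactly two
# decimal digits for each coordinate.
# Assumption: the numeric ranges are enforced after parsing, because a
# character class cannot express "0 to 90" or "0 to 180".
_GEO_RE = re.compile(r"geo:(-?[0-9]{1,2}\.[0-9]{2}),(-?[0-9]{1,3}\.[0-9]{2})")


def parse_location(value: str) -> Optional[Tuple[float, float]]:
    """Return (latitude, longitude) for a valid 2dp geo value, else None."""
    match = _GEO_RE.fullmatch(value)
    if match is None:
        return None
    lat = float(match.group(1))
    lon = float(match.group(2))
    # Assumed bounds: latitude within [-90, 90], longitude within [-180, 180].
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        return None
    return lat, lon
```

With this sketch, `parse_location("geo:-34.56,123.45")` yields a coordinate pair, while a value with three decimal places or a latitude of 91.00 is rejected.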
Rationale
Why not use an unstructured string?
A string with a city description is very difficult to use because of lack of standardization of location names (e.g. New York vs New York City) and people's tendency to use different levels of precision (e.g. New York vs Brooklyn)
Why are the latitude/longitude values rounded?
A user or application may accidentally publish a sensitive location like someone's home using unrounded values. Since Farcaster is a public protocol this data may end up indexed in databases and become impossible to erase. Rounding ensures that location data is not traceable to a specific building.
Why not use geohashing instead of latitude/longitude values?
Geohashing has a couple of benefits.
A geohash's main benefit is database indexing. Locations that are close together will have lexicographically similar hashes, which is useful if you are processing very high volumes of location-based queries. While this might be useful in some applications, it's quite trivial to convert a lat/long into a geohash format if necessary.
Second, the precision of a geohash is encoded in the representation. You can easily identify the precision via the length of the geohash string. Latitude, longitude values can be represented as strings, integers (given fixed precision), or floating point numbers and validating that the values have a certain level of precision requires more complicated logic (as compared to a string length check). The complexity here is localized and taken on by hubs rather than applications.
The main downside of a geohash is that developers will almost certainly need to convert a geohash into a latitude, longitude pair before using their preferred location service as most support lat/long out of the gate and not geohash. Doing this requires understanding the algorithm and picking a library to do the translation, which is a considerable effort. Some geohash variants have limited library support and are pretty complicated to understand.
It's also hard to pick a single geohash-style algorithm to use in the protocol as there are multiple standards with different tradeoffs. The ideal algorithm choice depends on how the data is being used.
It is simpler and more ergonomic from a developer's perspective to just publish values as lat/long and let developers convert them into a specific geohash format if necessary.
Why not represent the latitude/longitude values using JSON or protobuf?
UserDataBody message in a custom way just for location data since all other user data is represented as a string. Ultimately, it's unclear that the space vs ergonomics tradeoff is worth it here and we will migrate to a more space-efficient representation later if this feature is widely adopted and the space savings seem worth it.
Appendix
Comparing results of 1dp vs 2dp rounding
The results show that 1dp rounding often results in translating one locality to some nearby locality, even for pretty major cities. Accepting this would be to the detriment of the UX of the location features within and/or across apps.
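One way to see why 1dp rounding can move a point into a neighboring locality is to estimate the size of the grid cell at each precision. The sketch below is a rough approximation that assumes a spherical Earth with about 111.32 km per degree of latitude; the function name `cell_size_km` is hypothetical.

```python
import math
from typing import Tuple

# Assumption: spherical Earth, roughly 111.32 km per degree of latitude.
KM_PER_DEGREE_LAT = 111.32


def cell_size_km(decimals: int, latitude: float) -> Tuple[float, float]:
    """Approximate (north-south, east-west) size in km of one rounding step."""
    step_degrees = 10.0 ** -decimals
    north_south = step_degrees * KM_PER_DEGREE_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(latitude))
    return north_south, east_west


if __name__ == "__main__":
    for decimals in (1, 2):
        ns, ew = cell_size_km(decimals, 34.05)  # latitude of Los Angeles
        print(f"{decimals}dp: about {ns:.1f} km by {ew:.1f} km")
```

Under these assumptions a 1dp step near Los Angeles spans roughly 11 km by 9 km, while a 2dp step spans roughly 1.1 km by 0.9 km.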
Converting Lat/Long coordinates to a Place ID