diff --git a/.sphinx/_templates/header.html b/.sphinx/_templates/header.html index 54cc0a9..de1e65f 100644 --- a/.sphinx/_templates/header.html +++ b/.sphinx/_templates/header.html @@ -16,6 +16,14 @@ {{ product_page }} + + + + + +Curtis thinks this specific example is dubious, and if this is +representative, we need to really understand what users of these old +browsers can do on Launchpad. Recent changes to forms only support recent +browsers. Many AJAX operations do not work with IE6, for example, so +making the site look pretty will not help them use Launchpad. +Non-developers need to report bugs and ask questions: Ubuntu users use +their computer to report bugs, Answers could be a desktop app that talks +to Lp's API. + +YUI widgets +----------- + +YUI widgets should maintain YUI conventions. They should live with the +javascript for the widget. Where possible the CSS written for a YUI +widget should follow our conventions, but only as secondary to YUI +conventions. diff --git a/explanation/database-performance.rst b/explanation/database-performance.rst new file mode 100644 index 0000000..af5c204 --- /dev/null +++ b/explanation/database-performance.rst @@ -0,0 +1,281 @@ +Database performance +==================== + +.. include:: ../includes/important_not_revised.rst + +Poor query times - looks right, takes ages +------------------------------------------ + +Normally, the simplest form of the query is the fastest. + +Checklist of known problems +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +1. subselects, EXISTS, IS IN - these can be useful for optimizing + particular edge cases, but can degrade performance for the normal + cases. + +2. database views. These used to be useful to keep our code sane, but it + is now clearer to use Storm to express the view logic from the client + side. Database views now serve little purpose for us except to hide + unnecessary joins that will degrade performance. + +3. 
bad query plans generated on the DB server - talk to a team lead to + get an explain analyze done on staging, or to Stuart or a LOSA to get + the same done on production (if the staging one looks ok it's + important to check production too). + +``   Bad plans can happen due to out of date statistics or corner cases.  Sometimes rewriting the query to be simpler/slightly different can help. Specific things to watch out for are nested joins and unneeded tables (which can dramatically change the lookup).`` + +4. fat indices - if the explain looks sensible, talk to Stuart about + this. + +5. missing indices - check that the query should be indexable (and if in + doubt chat to Stuart). + + +6. using functions in ORDER BY: calling functions on every row of an + intermediary table - if a sort cannot be answered by iterating an + index then postgresql will generate an intermediate table containing + all the rows that match the constraints, and sort that in-memory; + functions that affect the sort order have to be evaluated before + doing the sort - and thus before any LIMIT occurs. + +7. Querying for unrelated tables. Quite possibly either prejoins, or + prepopulation of derived attributes. Look for a code path that is + narrower, or pass down a hint of some sort about the data you need so + the actual query can be more appropriate. Sometimes more data is + genuinely needed but still messes up the query: consider using a + later query rather than prejoining. E.g. using the pre_iter_hook of + DecoratedResultSet to populate the storm cache. + +Many small queries [in a webapp context] -- Late Evaluation +----------------------------------------------------------- + +Databases work best when used to work with sets of data, not objects - +but we write in python, which is procedural and we define per-object +code paths. + +One particular trip up that can occur is with related and derived data. 
+ +Consider: + +:: + + def get_finished_foos(self): + return Store.of(self).find(Foo, And(Foo.me == self.id, Foo.finished == True)) + +This will perform well for one object, but if you use it in a loop +going over even as few as 30 or 40 objects you will cause a large amount +of work - 30 to 40 separate round trips to the database. + +It's much better to prepopulate a cache of these finished_foos when you +request the owning object in the first place, when you know that you +will need them. + +To do this, use a Tuple Query with Storm, and assign the related objects +to a cached attribute which your method can return. For attributes the + +:: + + @cachedproperty('_foo_cached') + +can be used to do this in combination with a + +:: + + DecoratedResultSet + +Be sure to clear these caches with a Storm invalidation hook, to avoid +test suite fallout. Objects are not reused between requests on the +appservers, so we're generally safe there. (Our storm and sqlbase +classes within the Launchpad tree have these hooks, so you only need to +manually invalidate if you are using storm directly). + +A word of warning too - Utilities will often get in the way of +optimising this :) + +Diagnosis Tools and Approaches +------------------------------ + +EXPLAIN ANALYZE on staging and qastaging can be used by LOSAs, the TAs, +and squad leads. + +If you want to see how a query is working on a GET page locally, try the +`++oops++ `__ and +`++profile++ `__ tools. +++profile++ reportedly works on staging and qastaging now too. + +Unfortunately, they sometimes do not work properly for POSTs, and can't +be used in other scenarios. See Bug:641969, for instance. + +If you are working on a test and want to see how a query is working, try +one of these tools. + +- Use the built-in Storm debug tracer. If you start with this... 
+ +:: + + def test_some_storm_code(self): + < some setup logic > + < the Storm-using code I'm curious about > + < more stuff > + + +``...then you can use the debug tracer to see what's going on.  When you run your tests after changing the code to look like the below, stdout will include the queries run, and timestamps for start and finish.`` + +:: + + def test_some_storm_code(self): + < some setup logic > + from storm.tracer import debug; debug(True) + try: + < the Storm-using code I'm curious about > + finally: + debug(False) + < more stuff > + + +- StormStatementRecorder, LP_DEBUG_SQL=1, LP_DEBUG_SQL_EXTRA=1, + QueryCollector. In extremis you can also turn on statement logging in + postgresql. [Note: please add more detail if you are reading this and + have the time and knowledge.] +- Raise an exception at a convenient point, to cause a real OOPS. + +Efficient batching of SQL result sets: StormRangeFactory +-------------------------------------------------------- + +Batched result sets are rendered via the class +canonical.launchpad.webapp.batching.BatchNavigator. (This class is a +thin wrapper around lazr.batchnavigator.BatchNavigator.) + +BatchNavigator delegates the retrieval of batches from a result set to +an IRangeFactory (defined in lazr.batchnavigator.interfaces). The +default range factory is lazr.batchnavigator.ListRangeFactory. + +This factory uses regular Python slicing to access a batch, which is +mapped by the ORM to a query like + +:: + + SELECT ... FROM ... OFFSET o LIMIT l; + +for a slice operation result_set[o:o + l]. + +Finding the end of the result set, and skipping to the right offset, can +be very expensive for result sets with large numbers of rows. 
+StormRangeFactory uses a different approach: Given a query + +:: + + SELECT * FROM Table ORDER BY Table.column1, Table.column2; + +and given a batch where the values of column1, column2 in the last row of +the batch are value1, value2, it generates a query for the next batch by +adding a WHERE clause: + +:: + + SELECT * FROM Table WHERE (column1, column2) > (value1, value2) + ORDER BY Table.column1, Table.column2 LIMIT batchsize+1; + +Usage +~~~~~ + +The main change to use StormRangeFactory is simple: Just replace + +:: + + batchnav = BatchNavigator(resultset, request) + +with + +:: + + from canonical.launchpad.webapp.batching import StormRangeFactory + batchnav = BatchNavigator( + resultset, request, range_factory=StormRangeFactory(resultset)) + +Limitations +~~~~~~~~~~~ + +StormRangeFactory needs access to the columns used for sorting; it +retrieves the values of the sort columns automatically, using the +resultset.order_by() parameters. This has several consequences: + +1. Result sets must be entire model objects: + +:: + + resultset = store.find(Person) + resultset.order_by(Person.id) + +can be used with StormRangeFactory, but + +:: + + resultset = store.find(Person.id) + resultset.order_by(Person.id) + +can not be used. + +2. The order_by parameters must be Storm Column instances, optionally +wrapped into Desc(). + +:: + + resultset.order_by(Person.id) + resultset.order_by(Desc(Person.id)) + +works, but + +:: + + resultset.order_by('Person.id') + +does not work. + +3. Obviously, all sort columns must appear in the result set. This means +that + +:: + + resultset = store.find( + BugTask, BugTask.product == Product, Product.project == context) + resultset.order_by(Product.name, BugTask.id) + +does not work. Use + +:: + + resultset = store.find( + (BugTask, Product), BugTask.product == Product, + Product.project == context) + resultset.order_by(Product.name, BugTask.id) + +instead and wrap this result set into DecoratedResultSet. + +4. 
StormRangeFactory works only with regular Storm ResultSets and with +DecoratedResultSets, but not with legacy SQLObjectResultSets. + +:: + + resultset = store.find(Person) + resultset.order_by(Person.id) + +works, but + +:: + + resultset = Person.select(...) + +does not work. + +5. The beginning of a batch is represented in URLs by the query parameter + +For BatchNavigator, this parameter is an arbitrary string. +StormRangeFactory uses a class +DateTimeJSONEncoder(simplejson.JSONEncoder) to represent the sort column +values as a string. This means that only data types supported by +simplejson and datetime instances may be used for sorting the SQL result +set. diff --git a/explanation/datetime-usage.rst b/explanation/datetime-usage.rst new file mode 100644 index 0000000..4d58857 --- /dev/null +++ b/explanation/datetime-usage.rst @@ -0,0 +1,186 @@ +Datetime Usage Guide +==================== + +.. include:: ../includes/important_not_revised.rst + +There are a number of places in Launchpad where ``datetime`` types are used: + +- Python code +- in the database as table columns +- Storm wrappers for database tables, which act as an adapter between + the above two +- TALES fmt:date , fmt:time and fmt:datetime formatters. + +Furthermore, there are two main ``datetime`` types in use: + +- timestamps, which identify a particular point in time +- time deltas, which identify an interval in time + +Data Types +---------- + +Python +~~~~~~ + +We use the standard ``datetime`` module to represent time stamps and time +deltas -- the ``datetime.datetime`` type for timestamps, and the +``datetime.timedelta`` type for time deltas. + +To make matters a little bit more complicated, there are actually two +types of ``datetime.datetime`` objects: + +1. naïve datetime objects +2. timezone aware datetime objects + +While both objects share the same Python type, they can not be compared +with each other. Where possible, we use timezone aware ``datetime`` objects. 
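The incompatibility between the two kinds of ``datetime`` objects can be seen in a small experiment. This is only a sketch: it uses the standard library's ``datetime.timezone.utc`` instead of ``pytz`` to stay self-contained, and ``can_compare`` is a hypothetical helper written for this illustration, not part of Launchpad.

```python
import datetime

# A naive value carries no tzinfo; an aware value does.
naive = datetime.datetime(2005, 1, 1, 8, 0, 0)
aware = datetime.datetime(2005, 1, 1, 8, 0, 0, tzinfo=datetime.timezone.utc)


def can_compare(a, b):
    """Return True if Python allows an ordering comparison of a and b."""
    try:
        a < b
    except TypeError:
        # Raised when one operand is naive and the other is aware.
        return False
    return True


print(can_compare(naive, aware))
print(can_compare(aware, aware))
```

Ordering a naive value against an aware one raises ``TypeError``, which is why mixing the two kinds in one code path tends to fail at runtime rather than produce a wrong answer silently.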
+ +A timezone aware ``datetime`` can be created with the following code: + +.. code:: python + + import datetime + import pytz + + UTC = pytz.timezone('UTC') + dt = datetime.datetime(2005, 1, 1, 8, 0, 0, tzinfo=UTC) + +The ``pytz.timezone()`` function can be used to retrieve tzinfo objects for any +of the named Olson time zones. A ``datetime`` value can be converted to another +time zone as follows: + +.. code:: python + + perth_tz = pytz.timezone('Australia/Perth') + perth_time = dt.astimezone(perth_tz) + +PostgreSQL +~~~~~~~~~~ + +In PostgreSQL, the ``TIMESTAMP WITHOUT TIME ZONE`` should be used to represent +timestamps, and ``INTERVAL`` should be used to represent time deltas. +All timestamp columns in the database should store the time in UTC. + +While PostgreSQL has a ``TIMESTAMP WITH TIME ZONE`` type, it should not be used. +The difference between the two column types is that the value of a +``TIMESTAMP WITH TIME ZONE`` column will be converted to local time when being +read, and the reverse occurs when being written. +It does **not** actually store a time zone with the timestamp. + +Storm +~~~~~ + +To wrap a timestamp database column, use the ``storm.properties.DateTime`` +type. To wrap an interval database column, use the +``storm.properties.TimeDelta`` type: + +.. code:: python + + import pytz + from storm.properties import ( + DateTime, + TimeDelta, + ) + + from lp.services.database.stormbase import StormBase + + class TableName(StormBase): + timestamp = DateTime(name='timestamp', tzinfo=pytz.UTC) + interval = TimeDelta(name='interval') + +Page Templates +~~~~~~~~~~~~~~ + +Inside page templates, use the following TALES formatters to present +timestamp objects: + +- ``fmt:date`` +- ``fmt:time`` +- ``fmt:datetime`` +- ``fmt:approximatedate`` + +The preferred method of presenting datetime is: + +.. code:: xml + + + +When in doubt, use this presentation. 
+ +If the timestamp has a time zone attached, these formatters will convert +the date to the user's local time before display. + +For time interval objects, use the following formatters: + +- ``fmt:exactduration`` +- ``fmt:approximateduration`` + +Two Concepts of "Now" +--------------------- + +When working with the database, there are two distinct concepts of "now" +to work with: + +1. the time when the code is running (e.g. returned by ``datetime.now()``). +2. the database transaction time (when the transaction is committed, all + the changes will appear to have happened atomically at that time). + +Usually these two mean almost the same thing, but they will differ under +the following conditions: + +- clock skew between the application server and database server (should + not be a problem on our servers). +- with long running transactions, the second "now" will be the time at + the start of the transaction. + +In cases where you are comparing timestamps, mixing the two concepts of +"now" can result in race conditions. In most cases in Launchpad, the +database transaction time is the correct one to use. + +Database Transaction Time +~~~~~~~~~~~~~~~~~~~~~~~~~ + +To store the current database transaction time in the database, use the +following syntax: + +.. code:: python + + from lp.services.database.constants import UTC_NOW + + person.datecreated = UTC_NOW + +.. note:: + + You won't be able to read the value as a Python ``datetime`` + object until the object is flushed to the database, or the transaction + is committed. + +To store a time relative to the present time in a database column, we +can make use of the fact that ``UTC_NOW`` is an ``SQL()`` type: + +.. code:: python + + # timedelta has no "months" argument; 183 days is roughly six months. + membership.dateexpires = UTC_NOW + datetime.timedelta(days=183) + +The database transaction time can be retrieved using +``lp.services.database.sqlbase.get_transaction_timestamp``. 
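When building relative offsets such as the expiry in the example above, keep in mind that ``datetime.timedelta`` only counts fixed units (weeks, days, hours, seconds and so on) and has no ``months`` argument. If a calendar-accurate "N months later" is needed for a Python-side ``datetime`` value, the month arithmetic has to be spelled out. The following is a minimal sketch under that assumption; ``add_months`` is a hypothetical helper written for this illustration, not a Launchpad or Storm function, and it does not cover combining the result with ``UTC_NOW``.

```python
import calendar
import datetime


def add_months(dt, months):
    """Return dt shifted by a number of calendar months.

    The day is clamped to the last day of the target month, so
    31 August plus six months lands on the last day of February.
    """
    month_index = dt.month - 1 + months
    year = dt.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return dt.replace(year=year, month=month, day=min(dt.day, last_day))


start = datetime.datetime(2005, 8, 31, 8, 0, 0, tzinfo=datetime.timezone.utc)
print(add_months(start, 6))  # 2006-02-28 08:00:00+00:00
```

The helper keeps the original ``tzinfo``, so a timezone aware input produces a timezone aware result.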
+ +Present Time +~~~~~~~~~~~~ + +To create a Python ``datetime`` object that represents the present time, use +the following code: + +.. code:: python + + import datetime + import pytz + + UTC = pytz.timezone('UTC') + dt = datetime.datetime.now(UTC) + +Note that the ``datetime.utcnow()`` method should not be used -- it creates a +naïve ``datetime`` value, which can not be compared against other values in +Launchpad. diff --git a/explanation/engineering-overview-translations.rst b/explanation/engineering-overview-translations.rst new file mode 100644 index 0000000..90dda9c --- /dev/null +++ b/explanation/engineering-overview-translations.rst @@ -0,0 +1,560 @@ +Engineering Overview: Translations +================================== + +.. include:: ../includes/important_not_revised.rst + +This is for engineers who are already familiar with the Launchpad +codebase, but are going to work on the Translations subsystem. + +Use cases +--------- + +The purpose of Launchpad Translations is to translate programs' +user-interface strings into users' natural languages. To that end it +supports online translation, offline translation, uploads of translation +files from elsewhere, generation and download of translation files, +import from bzr branches, export to bzr branches, exports of language +packs, and so on. Something we're not very good at yet is helping users +bring Launchpad translations back upstream. + +We've got two major uses for Translations: + +1. Ubuntu and derived distributions. +2. Launchpad-registered projects. + +Sometimes we refer to these as the two “sides” of translation in +Launchpad: the *ubuntu side* and the *upstream side.* + +Where possible, the two sides are unified (in technical terms) and +integrated (in collaboration terms). But you'll see a lot of cases where +they are treated somewhat differently. Permissions can differ, +organizational structures differ, and some processes only exist on one +side or the other. 
+ +At the most fundamental level, the two sides are integrated through: + +- *global suggestions* — "here's a translation that was used / + suggested elsewhere for this same string" +- *translations sharing* — individual translations of the same string + can be shared in multiple places. This is a complex and multi-layered + affair that you'll see coming back later in this document. + +Ubuntu side +~~~~~~~~~~~ + +In a distribution, translation happens in the context of a source +package. That is, a given ``SourcePackageName`` in a given +``DistroSeries``. + +Translations sharing happens within a source package, between different +distribution release series. + +Most translations come in from upstream (Debian, Gnome), but we have a +sizeable community of users completing and updating these translations in +Launchpad. + +Ubuntu has a team of `translations +coordinators `__ +in charge of this process. + +Projects side +~~~~~~~~~~~~~ + +In a project, translation happens in the context of a project release +series. That is, a ``ProductSeries``. + +Translations sharing happens between the release series of a single +project. + +Project groups also play a small role in permissions management, but we +otherwise pretend they don't exist. + +Structure and terminology +------------------------- + +Essentially all translations in Launchpad are based on +`gettext `__. Software +authors mark strings in their codebase as translatable; they then use +the gettext tools to extract these and get them into Launchpad in one of +several ways. We also call the translatable strings *messages.* +Translatable strings are presumed to be in U.S. English (with language +code \`en`). 
+ +The top-level grouping of translations is a *template.* A +\`ProductSeries\` or \`SourcePackage\` can contain any number of +templates; typically it needs only one or two for the main program, a +main library that the program is built around, and so on; on the other +hand some projects create a template for each module. + +Because of our gettext heritage, we also refer to these templates as +“POTs,” “PO templates,” or “pot files.” + +In python terms, think: + +:: + + productseries.potemplates = [potemplate1] + potemplate1.productseries = productseries + + sourcepackage.potemplates = [potemplate2] + potemplate2.sourcepackage = sourcepackage + +A template can be on only one “side”; it belongs to a product series or +to a source package, but not both. + +Each template can be translated to one or more languages. Again because +of our gettext heritage, translation of a template into a language is +referred to as a *PO file.* A PO file is not just a shapeless bag of +translated messages; it specifically translates the messages currently +found in its template. + +In python terms: + +:: + + potemplate.pofiles = { + language: pofile, + } + + pofile.language = language + pofile.potemplate = potemplate + +(A gettext PO file is pretty much the same as a template file. A bit of +metadata aside, the big difference is that a template leaves the +translations blank.) + +The currently translatable messages in a template (“pot message sets”) +form a numbered sequence. This sequence defines which messages need to +be translated in the PO files. Messages that are no longer in the +template are *obsolete*; we may still track them but they are no longer +an active part of the template. 
+ +In python terms, think: + +:: + + potemplate.potmsgsets = [potmsgset1] + potemplate.obsolete_potmsgsets = set([potmsgset2]) + +Think of a translated string in a PO file as a *translation message.* +This gets a bit more complicated once you start looking at the database +schema, but from the perspective of a PO file it's accurate. + +:: + + translation_message1.potmsgset = potmsgset1 + translation_message1.language = pofile.language + +The actual translation text in a translation message is immutable. A +translation message will be updated with review information and such, +but its “contents” are fixed. From the model's perspective there's no +such thing as *changing* a translated string; that just means you create +or select a different translation message. + +A translation message can be *current* in a given PO file, or not. It's +an emergent property of more complex shared data structures. So you can +view a PO file as a customizable “view” on the current translations of a +particular template into a given language. + +:: + + pofile.current_translation_messages = { + potmsgset1: translation_message1, + } + +Often a translation message translates a message from a PO file's +template into the PO file's language, but is not current (from the +perspective of that PO file). In that case we consider it a +*suggestion.* We make it easy for users with the right privileges to +select suggestions to become current translations. + +:: + + pofile.suggestions = { + potmsgset2: [translation_message2], + } + +Plural forms +~~~~~~~~~~~~ + +A language can have one or more *plural forms.* These are the forms a +message can take depending on the value of a variable number that is +substituted into the message. For example, English has 2 forms: a +singular (“%d file” for 1) and plural (“%d files” for all other +numbers). 
Many languages lack this distinction; some are just like +English; some use the singular for the number zero; and some have more +forms, such as Arabic which has 6. + +GNU gettext knows how to choose the right form and substitute the +variable in one go. We define a *plural formula* for each language, and +that's what determines which form should be used for which numbers. + +Thus sometimes a translatable message that includes a number may +actually consist of 2 strings (one for singular, one for plural). +Similarly a translation message may contain one string per plural form +in the language. Only very few translatable messages need a plural form +though; most translatable messages and translations consist of a single +string each. + +Workflow +-------- + +Everything starts with templates. Usually a project owner or package +maintainer somewhere outside of Launchpad is responsible for producing +these and uploading them to Launchpad. There are a few automated streams +though: Soyuz package builds can produce them. For projects we can +import them from the development branch, and in some cases we can even +generate them from there automatically. + +A template is the one thing that absolutely every project must provide +before it can be translated. There is no way to edit a template's +contents in the web UI; it has to be imported as a file. + +Once a template has been created, and it contains translatable messages, +people can start translating. They can do this through the web UI, or +they can upload translation files in much the same way as the project +owners can upload template files. Translations can also be imported from +a bzr branch, just like a template. + +Depending on the wishes of the project owner, translation can be a +single-stage process (“people enter translations”) or a two-stage +process (“translators enter translations, reviewers check them and +approve the good ones”). + +Naturally, translations can be exported. 
The application will generate +PO files and templates on the fly based on the data in the database. It +can generate individual files, or tarballs for aggregate downloads. On +the Ubuntu side, there is also a mechanism for generating language packs +that is largely independent from the normal export mechanism. + +Suggestions and translations +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Looking within any given POFile, a POTMsgSet can have any number of +translations, of varying degrees of applicability: + +1. At most one \`TranslationMessage\` is its *current* translation. This + is actually more complex than it seems: it could be shared or + diverged, and may be current on both translation sides or only on the + side the POFile is on. +2. Other translations may have been submitted for the same POTMsgSet and + language, but not be current in this POFile. We call these *local + suggestions*. Technically, even a previously current translation + that's been superseded by a better one is considered a suggestion; + usually we only care about ones that were submitted after the + incumbent translation was made current. +3. There could be other POTMsgSets elsewhere with the exact same + translatable string. Any translation messages for those, to the same + language as our POFile, are called *global suggestions*. + +When translating in the UI, a suggestion shows up as a ready-made +translation that you can just select and approve. This, in principle, is +what translation teams do. They review suggestions made by translators. +Translation itself generally requires no special privileges. + +A user with review privileges on a given translation can operate in +“reviewer mode,” where everything they enter automatically becomes +current, or in “translator mode” where any translations they enter go +into the system as suggestions. + +Global suggestions are one of Launchpad Translations' key features. 
If +you want to translate the string “Quit” into Japanese for your program, +there are probably only one or two translations that everybody else +uses. Launchpad will show you those so that you don't need to come up +with your own. It's an example of horizontal integration between +projects. + +Local suggestions are shared. There is no such thing as a diverged +suggestion. When you enter a suggestion for a given POTMsgSet and +Language, it applies to all POFiles for the same language that share +that same POTMsgSet. + +Uploads +~~~~~~~ + +**TODO:** + +- Templates generally need uploading. +- Currently privileged. +- Per series, per template, per PO file. +- One queue record per user / POFile / …. + +Automated uploads +~~~~~~~~~~~~~~~~~ + +We have a few streams of automated uploads: + +- Templates and translations from distro package builds (*Soyuz + uploads*). +- Import of templates and translations from Bazaar branches (*bzr + integration*). +- For some projects we support automatic generation of templates from + source updates, using the build farm (*template generation*). + +Some of these streams have their own custom “approval” logic for +figuring out which file should be imported where. This is because +automated processes give us more consistent file paths and such. If the +custom logic fails to match uploads with PO files or templates, the work +is generally left to the import queue gardener. + +Soyuz uploads are different in that regard: all its custom logic is +built into the gardener because the two developed hand in hand. Mainly +for this reason, the gardener's approval logic is fiendishly complex. + +Permissions and organization +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Message sharing +~~~~~~~~~~~~~~~ + +**TODO:** + +- Why divergence? +- Why can't messages always track? + +Objects and schema +------------------ + +See my horrible `schema overview `__ +(dia format). 
+ +In a nutshell: + +- A \`POTemplate\` lives in either a \`ProductSeries\` or a + \`SourcePackage\` (in the database: \`(SourcePackagename, + DistroSeries)`). +- A \`POFile\` is the translation of a \`POTemplate\` to a \`Language`. +- A \`POTMsgSet\` is a translatable message. +- \`TranslationTemplateItem\` is a linking table. It says which + \`POTMsgSet`s participate in which \`POTemplate`s. +- A \`TranslationMessage\` translates a \`POTMsgSet\` to a \`Language\` + on one translation side, or if diverged, specifically in one + \`POTemplate`. +- A \`POMsgID\` holds the text of a translatable string; \`POTMsgSet\` + refers to it (once for singular, once for plural if appropriate). +- A \`POTranslation\` holds the text of a translated string; + \`TranslationMessage\` refers to it (once for every plural form in + the language). +- A \`TranslationGroup\` is an organizational structure for managing + translations. +- A \`Translator\` is an entry in a \`TranslationGroup\` saying who is + responsible for a particular \`Language`. +- A \`POExportRequest\` is an entry in the translations export queue. +- A \`TranslationImportQueue\` entry is an upload on the translations + import queue. + +Our largest database table is \`TranslationMessage`. Once upon a time it +grew to 300 million rows, but thanks to message sharing it's now a +fraction of that size. + +Message sharing +~~~~~~~~~~~~~~~ + +A single \`POTMsgSet\` (translatable message) can participate in +multiple templates. We then call these templates *sharing templates.* +And that means that a translation message to, say, Italian will be +available in each of those templates' PO file for Italian. + +This is where it gets complicated; please fasten your seatbelts and +extinguish smoking motherboards. + +A translation message can be in one of three sharing states: + +1. **Diverged.** The translatable message may be in multiple templates, + but this particular translation of it is specific to just one of + those templates. 
+2. **Shared.** The translation is valid for all the PO files *on this + translation side* whose templates share the same translatable + message. Some of those PO files may have diverged translations + overriding it, but this one is the default. +3. **Tracking.** The translation is not only shared on one translation + side, but between both translation sides. + +We have a `design +document `__ +that specifies how messages in these states respond to changes. We try +to make it easy to move a translation down this list (towards tracking) +and hard to move up the list (towards diverged). + +Which message is current? +^^^^^^^^^^^^^^^^^^^^^^^^^ + +The usage and sharing state of a translation message is recorded as +three data items: + +- “current on the Ubuntu side” +- “current on the upstream side” +- “diverged for template X” + +(By the way, that leaves some redundant possibilities: a diverged +message can only be current on the side of the template it's specific +to. And a message shouldn't be diverged if it's not current.) + +So given a PO file and a translatable message, how do you find the +current translation message? Look for one with: + +- the same language as the PO file, +- the translatable message you're looking for, +- its “current” bit set for the side your PO file's template is on, and +- diverged to your template or, if no message matches, not diverged at + all. + +(On a sidenote, this is why “simple” translation statistics can be quite +hard to compute.) + +Which templates share? +^^^^^^^^^^^^^^^^^^^^^^ + +There are two separate notions of which templates share. You'd expect +these to be the same thing, but reality gets a bit more complicated: + +- **Sharing templates** have one or more translatable messages in + common, regardless of how that happened. +- An **equivalence class** is a set of templates that *should* share + their messages if possible, regardless of whether they actually do. + +Why the difference? 
Sharing templates is a useful term for reasoning +about data, but as a rule the code doesn't care about them (and would +find it a costly thing to query if it did). But when the application +adds a translatable message to a template X, it does care about +equivalence classes. If another template Y in the same equivalence class +already has the same translatable string, no new message is created; the +existing one from Y is simply added to X. + +After that, lots of things can happen: templates can be renamed, moved, +added, deleted; administrators may have to change data by hand. And +that's where differences between "sharing templates" and the +"equivalence class" can sneak in. But in principle they should be more +or less the same. + +An equivalence class consists roughly of all templates with the same +name, in a project and its associated Ubuntu package. Look at +\`POTemplateSharingSubset\` for the details. + +Processes +--------- + +Import queue +~~~~~~~~~~~~ + +**TODO:** Describe. + +Gardener +^^^^^^^^ + +**TODO:** Describe. + +Export queue +~~~~~~~~~~~~ + +**TODO:** Describe. + +Language packs +~~~~~~~~~~~~~~ + +**TODO:** Describe. + +Bazaar imports +~~~~~~~~~~~~~~ + +**TODO:** Describe. + +Bazaar exports +~~~~~~~~~~~~~~ + +**TODO:** Describe. + +Template generation +~~~~~~~~~~~~~~~~~~~ + +**TODO:** Describe. + +Statistics update +~~~~~~~~~~~~~~~~~ + +**TODO:** Describe. + +Packaging translations +~~~~~~~~~~~~~~~~~~~~~~ + +**TODO:** Describe. + +Translations pruner +~~~~~~~~~~~~~~~~~~~ + +**TODO:** Describe. + +Caches +------ + +POFileTranslator +~~~~~~~~~~~~~~~~ + +Here is Launchpad Translations' equivalent of Cobol: old, ugly, in +desperate need of an overhaul — and as yet, irreplaceable. + +\`POFileTranslator\` caches who contributed to which \`POFile`s, and +when. It's always been the basis for listing contributors, but we have +started using it for more. It's how we list a user's translation +activity on their personal Translations page. 
We're not sure the table +is really a very good fit for that; its purpose has become diluted and +we haven't fully reviewed how it matches our needs. + +The ``POFileTranslator`` table is kept very roughly consistent by a +database trigger on ``TranslationMessage``. But we decided to move that +work into Python. That would be particularly useful for translation +imports, where we do mass updates to (mostly) a single set of PO files +for a single user. + +Status as of 2012-05-14: the trigger is about to be simplified so that +it takes care of creating new ``POFileTranslator`` records as needed, +but it will no longer try to clean up ones that become obsolete, or keep +track of the most recent ``TranslationMessage`` a user contributed. A +daily “scrubbing” job is about to land as part of Garbo, which will take +care of eventual consistency. + +SuggestivePOTemplate +~~~~~~~~~~~~~~~~~~~~ + +The database query that looks for global suggestions is relatively +costly, and we need to do it for every translatable message on a +translation page. Its complexity also makes the SQL logs hard to follow. + +A large part of this query (in terms of SQL text) was involved in +finding out what templates were eligible for taking suggestions from. +This part was also completely repetitive, and it doesn't even need to be +immediately consistent, so we materialized it as a simple cache table +called ``SuggestivePOTemplate``. + +We refresh this cache all the time by clearing out the table and +rewriting it. This keeps the code simple and it's certainly fast enough; +we used to gather the same data 10× per page. Some changes may also +update it incrementally. + +So don't worry if anything happens to this data. It will be rewritten +very soon. + +POTExport +~~~~~~~~~ + +This isn't really a cache, but it was sort of meant as one. It's a +database view on a join that was apparently once meant to speed up +translation exports.
To express that a view is involved, the model class +is called \`VPOTExport`. + +In all probability though, this does not help performance in any way +whatsoever. There used to be a similar view for translations, called +\`POExport`, and removing it from our code has done a lot to speed up +exports. It also simplified the code. + +But removing \`POTExport\` has never become a priority. It's simpler +than \`POExport\` was, so probably less costly; and as a rule there's +much less template data to export than there is translation data. So +getting rid of this would be a nice cleanup, but not vital. + +UI shortcuts +------------ +* Import queue: https://translations.launchpad.net/+imports +* Languages: https://translations.launchpad.net/+languages +* Translation groups: https://translations.launchpad.net/+groups + diff --git a/explanation/error-explanations.rst b/explanation/error-explanations.rst new file mode 100644 index 0000000..9e82f0a --- /dev/null +++ b/explanation/error-explanations.rst @@ -0,0 +1,88 @@ +Error explanations +================== + +This page is to aid debugging unhelpful error messages. + +``AttributeError: 'NoneType' object has no attribute 'poll'`` +------------------------------------------------------------- + +.. code:: pytb + + Running canonical.testing.layers.TwistedAppServerLayer tests: + Set up canonical.testing.layers.ZopelessLayer in 3.350 seconds. + Set up canonical.testing.layers.DatabaseLayer in 0.457 seconds. + Set up canonical.testing.layers.LibrarianLayer in 4.786 seconds. + Set up canonical.testing.layers.LaunchpadLayer in 0.000 seconds. + Set up canonical.testing.layers.LaunchpadScriptLayer in 0.001 seconds. + Set up canonical.testing.layers.LaunchpadZopelessLayer in 0.000 seconds. + Set up canonical.testing.layers.TwistedLaunchpadZopelessLayer in 0.000 seconds. + Set up canonical.testing.layers.TwistedAppServerLayer in 31.229 seconds. 
+ Running: + test_lock_with_magic_id (canonical.codehosting.puller.tests.test_scheduler.TestPullerMasterIntegration)Traceback (most recent call last): + File "./test.py", line 190, in ? + result = testrunner.run(defaults) + File "/home/mwh/canonical/checkouts/trunk/lib/zope/testing/testrunner.py", line 238, in run + failed = not run_with_options(options) + File "/home/mwh/canonical/checkouts/trunk/lib/zope/testing/testrunner.py", line 403, in run_with_options + setup_layers, failures, errors) + File "/home/mwh/canonical/checkouts/trunk/lib/zope/testing/testrunner.py", line 585, in run_layer + return run_tests(options, tests, layer_name, failures, errors) + File "/home/mwh/canonical/checkouts/trunk/lib/zope/testing/testrunner.py", line 513, in run_tests + test(result) + File "/home/mwh/canonical/checkouts/trunk/lib/twisted/trial/unittest.py", line 632, in __call__ + return self.run(*args, **kwargs) + File "/home/mwh/canonical/checkouts/init-stack-pull/lib/canonical/codehosting/puller/tests/test_scheduler.py", line 681, in run + return TrialTestCase.run(self, result) + File "/home/mwh/canonical/checkouts/trunk/lib/twisted/trial/unittest.py", line 961, in run + result.stopTest(self) + File "/home/mwh/canonical/checkouts/trunk/lib/twisted/trial/unittest.py", line 1158, in stopTest + self.original.stopTest(method) + File "/home/mwh/canonical/checkouts/trunk/lib/zope/testing/testrunner.py", line 871, in stopTest + self.testTearDown() + File "/home/mwh/canonical/checkouts/trunk/lib/zope/testing/testrunner.py", line 755, in testTearDown + layer.testTearDown() + File "/home/mwh/canonical/checkouts/init-stack-pull/lib/canonical/testing/profiled.py", line 28, in profiled_func + return func(cls, *args, **kw) + File "/home/mwh/canonical/checkouts/init-stack-pull/lib/canonical/testing/layers.py", line 1535, in testTearDown + LayerProcessController.postTestInvariants() + File "/home/mwh/canonical/checkouts/init-stack-pull/lib/canonical/testing/profiled.py", line 28, in 
profiled_func + return func(cls, *args, **kw) + File "/home/mwh/canonical/checkouts/init-stack-pull/lib/canonical/testing/layers.py", line 1389, in postTestInvariants + if cls.appserver.poll() is not None: + AttributeError: 'NoneType' object has no attribute 'poll' + +Solution +~~~~~~~~ + +Probably one of: run ``make build``, run ``make schema``, or kill any left-over librarian processes. + +``AttributeError: 'thread._local' object has no attribute 'interaction'`` +------------------------------------------------------------------------- + +Example: + +.. code:: pytb + + Error in test lp.registry.tests.test_distroseries.TestDistroSeriesGetQueueItems.test_get_queue_items + Traceback (most recent call last): + File "/home/cjwatson/src/canonical/launchpad/lp-branches/queue-filter-source-bug-33700/lib/lp/testing/__init__.py", line 322, in run + testMethod() + File "/home/cjwatson/src/canonical/launchpad/lp-branches/queue-filter-source-bug-33700/lib/lp/registry/tests/test_distroseries.py", line 261, in test_get_queue_items + pub_source = self.getPubSource(sourcename='alsa-utils') + File "/home/cjwatson/src/canonical/launchpad/lp-branches/queue-filter-source-bug-33700/lib/lp/soyuz/tests/test_publishing.py", line 159, in getPubSource + spn = getUtility(ISourcePackageNameSet).getOrCreateByName(sourcename) + AttributeError: 'thread._local' object has no attribute 'interaction' + + +Solution +~~~~~~~~ + +Call ``login()``. This error often happens when trying to +use core Launchpad objects when not logged in. + +Most of the time ``login(ANONYMOUS)`` is good enough. ``login`` and +``ANONYMOUS`` should be imported from ``lp.testing``. + +If you get this error when trying to use the ``LaunchpadObjectFactory``, +you should consider making your test a subclass of +``TestCaseWithFactory``.
diff --git a/explanation/feature-flags.rst b/explanation/feature-flags.rst new file mode 100644 index 0000000..8eae069 --- /dev/null +++ b/explanation/feature-flags.rst @@ -0,0 +1,340 @@ +Feature Flags +============= + +.. include:: ../includes/important_not_revised.rst + +**Feature flags allow Launchpad's configuration to be changed while it's +running, and for particular features or behaviours to be exposed to only +a subset of users or requests.** + +Please note that any changes to feature flags need to be recorded on +https://wiki.canonical.com/InformationInfrastructure/OSA/LaunchpadProductionStatus + +Key points +---------- + +- Guard new potentially-dangerous or controversial features by a flag. +- Make sure the documentation is clear enough to make sense to a LOSA + in a high-pressure situation; **don't assume** they will be familiar + with the detailed implementation of the feature. + +Scenarios +--------- + +- Dark launches (aka embargoes: land code first, turn it on later) +- Closed betas +- Scram switches (eg "omg daily builds are killing us, make it stop") +- Soft/slow launch (let just a few users use it and see what happens) +- Site-wide notification +- Show an 'alpha', 'beta' or 'new!' badge next to a UI control, then + later turn it off without a new rollout +- Show developer-oriented UI only to developers (eg the query count) +- Control page timeouts (or other resource limits) either per page id, + or per user group +- Set resource limits (eg address space cap) for jobs. + +Concepts +-------- + +A **feature flag** has a string name, and has a dynamically-determined +value within a particular context such as a web or API request. The +value in that context depends on determining which **scopes** are +relevant to the context, and what **rules** exist for that flag and +those scopes. The rules are totally ordered and the highest-priority rule +whose scope is active determines the value. + +Flag values are strings or, if no value is specified, ``None``.
(If an +empty value is specified, the flag's value is the empty string.) + +For a list of available flags and scopes see +https://launchpad.net/+feature-info + +Priority +-------- + +Priority is exposed as an integer that gives a total order across all +rules for a particular flag. The numerically highest priority wins. `For +example `__ with these +rules + +:: + + hard_timeout team:admins 1 18000 + hard_timeout default 0 15000 + +the first rule has a higher priority (1 > 0). So that rule is evaluated +first, and it will match for anyone in ~admins. If that doesn't match, +the second is evaluated and it is always true. So admins get an 18s +timeout, and everyone else 15s. + +Operations +---------- + +A change to a flag in production counts as a production change: it is +made by IS on request. Make the change `on the appropriate wiki +page `__ +(sorry, company internal), including `an approval per the usual +policy `__, +and then ask in +`~launchpad `__ +or ``#launchpad-dev``. + +Feature rules are loosely coupled to code changes: you can activate +rules before the code they control is live. + +Web interface +------------- + +- https://launchpad.net/+feature-rules shows the currently active + rules. This is visible to ~launchpad (developers etc) and writable by + LOSAs. +- https://launchpad.net/+feature-info describes the available features + and scopes. + +Debugging +--------- + +An HTML comment at the bottom of rendered pages describes which features +were looked up, and which scopes were consulted to make that decision. +This doesn't include features that could be active but aren't relevant +to the page, or scopes that may be active but aren't relevant to +deciding the value of the features. + +Performance +----------- + +Feature flags are designed and intended to be fast enough that they can +be used as much as is useful within reason. Each flag and each scope is +evaluated at most once per request.
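
To make the priority ordering and the once-per-request behaviour concrete, here is a minimal, self-contained sketch. It is not Launchpad's implementation: the class name ``RequestFlagResolver``, its methods, and the tuple layout of the rules are assumptions made purely for illustration. Rules are tried from the numerically highest priority down, and both scope checks and flag results are memoised for the lifetime of one resolver, which stands in for a single request.

```python
class RequestFlagResolver:
    """Hypothetical per-request resolver (illustration only, not Launchpad code)."""

    def __init__(self, rules, scope_checker):
        # rules: iterable of (flag, scope, priority, value) tuples.
        # Highest numeric priority is tried first.
        self._rules = sorted(rules, key=lambda rule: rule[2], reverse=True)
        self._scope_checker = scope_checker
        self._scope_cache = {}
        self._flag_cache = {}

    def in_scope(self, scope):
        # Each scope is checked at most once per resolver (i.e. per request).
        if scope not in self._scope_cache:
            self._scope_cache[scope] = self._scope_checker(scope)
        return self._scope_cache[scope]

    def get_flag(self, flag):
        # Each flag is resolved at most once; None means "no rule matched".
        if flag not in self._flag_cache:
            value = None
            for name, scope, _priority, rule_value in self._rules:
                if name == flag and self.in_scope(scope):
                    value = rule_value
                    break
            self._flag_cache[flag] = value
        return self._flag_cache[flag]


# The two rules from the Priority section, as tuples.
rules = [
    ("hard_timeout", "default", 0, "15000"),
    ("hard_timeout", "team:admins", 1, "18000"),
]
scope_checks = []


def only_default_scope(scope):
    # Stand-in for a request made by someone who is not in ~admins.
    scope_checks.append(scope)
    return scope == "default"


resolver = RequestFlagResolver(rules, only_default_scope)
print(resolver.get_flag("hard_timeout"))
print(resolver.get_flag("hard_timeout"))  # answered from the cache
print(scope_checks)
```

Recording the scope checks in a list makes it easy to see that the second lookup does no further work.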
+ +If the page does not check any flags, no extra work will be done. The +first time a page checks a flag, all the rules will be read from the +database and held in memory for the duration of the request. + +Scopes may be expensive in some cases, such as checking group +membership. Whether a scope is active or not is looked up the first time +it's needed within a particular request. + +Naming conventions +------------------ + +Flag naming +~~~~~~~~~~~ + +Flags should be named as + +**``area.feature.effect``** + +where each of the parts is a legal Python name (so use underscores to +join words, not dashes.) + +The **area** is the general area of Launchpad this relates to: eg +'code', 'librarian', ... + +The **feature** is the particular thing that's being controlled, such as +'recipes' or 'render_time'. + +The **effect** is typically 'enabled', 'visible', or 'timeout'. These +should always be in the positive sense, not 'disabled'. If timeouts are +given, they should be in seconds (decimals can be given in the value.) + +Scope naming +~~~~~~~~~~~~ + +Scopes are matched using a simple regexp and for those that take +parameters they are separated by a colon, e.g. ``team:admins``. + +There is no way at present to give a rule that checks multiple scopes or +any other boolean conditions. You need to either choose one to match +first, or add a new scope that matches just what you need, or extend the +feature flag infrastructure to evaluate boolean expressions. + +Reading a feature flag +---------------------- + +- Python code: lp.services.features.getFeatureFlag(name) => value + +- TAL code: hello world!" + +.. note:: + + ``features/name`` may not work! If you get a ``KeyError`` for + ``features``, try ``request/features/name`` instead. + +You can conditionally show some text like this + +:: + + + +  •  + Take our survey! 
+ + +You can use the built-in TAL feature of prepending ``not:`` to the condition, +and for flags that have a value you could use them in ``tal:replace`` or +``tal:attributes``. + +If you just want to simply insert some text taken from a feature, say +something like + +.. code:: + + Message of the day: ${motd.text} + +Templates can also check whether the request is in a particular scope, +but before using this consider whether the code will always be bound to +that scope or whether it would be more correct to define a new feature: + +:: + +

+ Staging server: all data will be discarded daily!

+ +Boolean values +-------------- + +Frequently it is desired to have a boolean feature flag that can be used +to toggle something on or off. + +Decide what the behaviour should be when the flag is unset; that +behaviour must correspond to ``False``, so name the flag accordingly. + +When checking the flag, apply ``bool()`` to the returned value and use +the result as the value of the flag. + +This means that unset and the empty string are ``False`` and anything +else is ``True`` (note that this means that "false", "False", "off", "0", +etc. all mean ``True``). + +For example + +:: + + if getFeatureFlag('soyuz.frobble_the_wotsits.enabled'): + wotsit.frobble() + +Adding and documenting a new feature flag +----------------------------------------- + +If you introduce a new feature flag, as well as reading it from +wherever is useful, you should also: + +- Add an entry to ``flag_info`` in ``lib/lp/services/features/flags.py`` + describing the flag, including documentation that will make sense to + people not intimately involved with development of the feature. For + example: + +:: + + # This table of flag name, value domain, and prose documentation is used to + # generate the web-visible feature flag documentation. + flag_info = sorted([ + ('code.recipes_enabled', + 'boolean', + 'enable recipes', + ''), + # ... further entries ... + ]) + +The last item in that list is descriptive, not prescriptive: it +*documents the code's default behavior* if no value is specified. The +flag's value will still read as ``None`` if no value is specified, and +setting it to an empty value still returns the empty string. + +Adding a new scope controller +----------------------------- + +Add a new class in + +:: + + lib/lp/services/features/scopes.py + +and make sure it's in + +:: + + HANDLERS + +in that file. (You'll normally do this by adding it to + +:: + + WEBAPP_SCOPE_HANDLERS + +and/or + +:: + + SCRIPT_SCOPE_HANDLERS + +depending on whether it applies to webapp requests, scripts, or both).
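
As a rough illustration of what such a class might look like, the sketch below pairs a regular expression with a lookup method, following the description in "Scope naming". It is self-contained and deliberately does not import anything from ``lib/lp/services/features/scopes.py``: the class names, the ``pattern`` and ``lookup`` attributes, and the ``scope_is_active`` helper are all hypothetical, so check the real module for the actual base class before copying anything.

```python
import re


class DefaultScope:
    """Hypothetical handler for the always-active ``default`` scope."""

    pattern = r"default$"

    def lookup(self, scope_name):
        return True


class TeamScope:
    """Hypothetical handler for scopes written as ``team:<name>``."""

    pattern = r"team:(?P<team>[a-z0-9][a-z0-9+.-]*)$"

    def __init__(self, user_teams):
        # In this sketch the caller supplies the memberships up front;
        # a real handler would look them up for the current user.
        self.user_teams = frozenset(user_teams)

    def lookup(self, scope_name):
        match = re.match(self.pattern, scope_name)
        return match is not None and match.group("team") in self.user_teams


def scope_is_active(handlers, scope_name):
    """Ask the first handler whose pattern matches; unknown scopes are inactive."""
    for handler in handlers:
        if re.match(handler.pattern, scope_name):
            return handler.lookup(scope_name)
    return False


# Stand-in for the registration list mentioned above.
handlers = [DefaultScope(), TeamScope({"admins"})]
print(scope_is_active(handlers, "team:admins"))
print(scope_is_active(handlers, "team:beta"))
```

Keeping the pattern next to the lookup logic means that registering a handler is just a matter of adding an instance to the list.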
+ +Testing +------- + +``FeatureFixture`` uses the testtools fixtures API to hook into your code. When +it is installed on a TestCase object, the fixture will be automatically torn +down and reset between tests, restoring all of the originally set flags. + + +.. note:: + + There is one gotcha: all existing flags are wiped out by the + fixture for the duration of the test. If you want them to carry over, + you need to arrange that yourself. + +You can use the fixture in three different ways: + +- With the ``TestCase.useFixture()`` method +- As a context manager, using the ``with`` statement +- By directly calling a fixture instance's ``setUp()`` and ``cleanUp()`` methods + +Here is some sample code demonstrating each: + +.. code:: python + + from lp.services.features import getFeatureFlag + from lp.services.features.testing import FeatureFixture + from lp.testing import TestCase + from lp.testing.layers import DatabaseFunctionalLayer + + + class FeatureTestCase(TestCase): + + layer = DatabaseFunctionalLayer # Features need the database for now + + def test_useFixture(self): + # You can use the fixture with the useFixture() TestCase method: + self.useFixture(FeatureFixture({'reds': 'on'})) + self.assertEqual('on', getFeatureFlag('reds')) + + def test_with_context_manager(self): + # Or as a context manager: + with FeatureFixture({'blues': None}): + self.assertIsNone(getFeatureFlag('blues')) + + def test_setUp_and_cleanUp(self): + # You can call a fixture's setUp() and cleanUp() directly. + # This is good for use in doctests. + flags = FeatureFixture({'greens': 'mu'}) + flags.setUp() + self.addCleanup(flags.cleanUp) # or use a try/finally + +For more details on using the fixture and other feature flag utilities, +check the module docs in ``lib/lp/services/features/__init__.py``.
+ +For sample code, check: + +- ``lib/lp/services/features/testing.py`` +- ``lib/lp/services/features/tests/test_helpers.py`` +- ``$ grep -r FeatureFixture lib/lp/`` + +Tips and traps +-------------- + +- `When you soft launch a feature limited to + launchpad-beta `__ + or some other group, remember that it won't get any testing from + anonymous users. + +See also +-------- +- `bugs tagged feature-flags `__ diff --git a/explanation/hacking.rst b/explanation/hacking.rst new file mode 100644 index 0000000..1a4efb5 --- /dev/null +++ b/explanation/hacking.rst @@ -0,0 +1,867 @@ +Hacking +======= + +.. include:: ../includes/important_not_revised.rst + +.. note:: + + Want to navigate the source tree? Look at :doc:`Navigating `. + +Python programming +------------------ + +Which version of Python should I target? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Currently, Launchpad requires Python 3.5. + +How should I format my docstrings? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +First of all, thank you for writing docstrings. They make the code much easier +to follow for the people that didn't originally write it. +To answer the question, you have the following options. + +- A single short sentence. +- A single short sentence, blank line, further explanation. +- A single short sentence, blank line, rst-style explanation of arguments and + return value. +- A single short sentence, blank line, further explanation with rst-style + explanation of arguments and return value. + +You may include a short doctest in the further explanation. +See the examples in the `Epydoc documentation`_. +We're using the **rst** or **ReStructuredText** format, and not using the +**Javadoc** or **Epytext** formats. + +.. _Epydoc documentation: http://epydoc.sourceforge.net/fields.html + +See also: `PEP-8, Style Guide for Python Code`_ and `Docstring Conventions`_. 
+ +Note that we're not using the full expressiveness of `ReStructuredText`, so [[http://python.org/peps/pep-0287.html|PEP-287]] doesn't apply. + +.. _PEP-8, Style Guide for Python Code: http://python.org/peps/pep-0008.html +.. _Docstring Conventions: http://python.org/peps/pep-0257.html + +How should I use assertions in Launchpad code? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + + See XXX AssertionsInLaunchpad. + +What exceptions am I allowed to catch? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +See XXX ExceptionGuidelines. + +What exception should I raise when something passed into an API isn't quite right? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +In short, never raise ``ValueError``, ``NameError`` or ``TypeError``, and avoid +subclassing these exceptions as well. The full instructions are at +XXX ExceptionGuidelines. + +In the case of ``NotFoundError``, if you are going to catch this specific error +in some other code, and then take some corrective action or some logging +action, then seriously consider creating your own subclass. +This allows your code to handle exactly the situation that you expect, and not +be tricked into handling ``NotFoundErrors`` raised by code several levels in. + +When writing docstrings, always think whether it makes things clearer to say +which exceptions will be raised due to inappropriate values being passed in. + +I have a self-posting form which doesn't display the updated values after it's submitted. What's wrong? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +For now, all self-posting forms have to call +``lp.services.database.sqlbase.flush_database_updates()`` after processing the +form. + +.. "Is that still relevant today? In particular, is that still relevant with + LaunchpadFormView?" -- DavidAllouche <> + +I need to define a database class that includes a column of dbschema values. How do I do this? 
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Use an XXX EnumCol. + +I have received a security proxied list from some API. I need to sort it, but the ``sort()`` method is forbidden. What do I do? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +When you get a list from a security proxied object, that list is protected +against being altered. This is important, because you don't know who else might +have a reference to that list. + +When programming in Python generally, it is a good idea to make a copy of a +list you get from somewhere before altering it. +The security proxies just enforce this good practice. + +You can make a copy of the list by using the ``list`` constructor. +Here is an example, using the launchpad API. + +.. code-block:: python + + members = getUtility(ITeamParticipationSet).getAllMembers() + members = list(members) # Get a mutable list. + members.sort() + + +You can also use the ``sorted`` builtin to do this. + +.. code-block:: python + + members = sorted(getUtility(ITeamParticipationSet).getAllMembers()) + +SQL Result Set Ordering +~~~~~~~~~~~~~~~~~~~~~~~ + +If the ordering of an SQL result set is not fully constrained, then your tests +should not be dependent on the natural ordering of results in the sample data. + +If Launchpad does not depend on the ordering of a particular result set, then +that result set should be sorted within the test so that it will pass for any +ordering. + +As a general rule, the result sets whose order we want to test are the ones +that are displayed to the user. These should use an ordering that makes sense +to the user, and the tests should ensure that happens. + +How do I use a PostgreSQL stored procedure/expression as the order_by in Storm? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +You have to wrap it in ``SQL()``: + +.. 
code-block:: python + + from storm.expr import Desc, SQL + + store.find(Person).order_by(SQL("person_sort_key(displayname, name)")) + + store.find(Question, SQL("fti @@ ftq('firefox')")).order_by( + Desc(SQL("rank(fti, ftq('firefox'))"))) + +How do I generate SQL commands safely? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The safest way is to let Storm's query compiler do it: + +.. code-block:: python + + results = list( + store.find((Person.id, Person.name), Person.display_name == 'Stuart Bishop')) + +If you can't do that, perhaps because you're doing something too complicated +for the query compiler to manage, then the next safest way is to ensure that +all data is passed in as parameters, the way the DB-API intended it: + +.. code-block:: python + + results = list(store.execute( + "SELECT id, name FROM Person WHERE displayname = %s", + params=('Stuart Bishop',))) + +If you need to embed your data in the SQL query itself, there is only one rule +you need to remember: quote your data. Failing to do this opens up the system +to an SQL injection attack, one of the more common and widely known security +holes and also one of the more destructive. Don't attempt to write your own +quote method; you will probably get it wrong. The only two formats you can use +are ``%d`` and ``%s``, and anything substituted for ``%s`` must always be +passed through ``quote()``, *no exceptions!* + +.. code-block:: python + + from lp.services.database.sqlbase import quote + results = list(store.execute( + "SELECT id, name FROM Person WHERE displayname = %s" % quote("Stuart Bishop"))) + + store.execute("SELECT * FROM Person WHERE name = %s" % quote( + "'; drop table person cascade; insert into person(name) values ('hahaloser')")) + + +The second command in the previous example demonstrates a simple argument that +might be passed in by an attacker. + +What date and time related types should I use?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +See XXX DatetimeUsageGuide for information on what types to use, and how the +Python ``datetime`` types relate to database types through Storm. + +Python segfaulted. What should I do? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Python programs should never segfault, but it can happen if you trigger a bug +in an extension module or the Python core itself. +Since a segfault won't give you a Python traceback, it can be a bit daunting +trying to debug these sort of problems. +If you run into this sort of problem, tell the list. + +See XXX DebuggingWithGdb for some tips on how to narrow down where a segfault +bug like this is occurring. In some cases you can even get a Python stack +trace for this sort of problem. + +I want an object to support ``__getitem__``, what's the best style? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Many Launchpad objects support ``__getitem__``. For example, if you have a +``Foo``, and want to support ``Foo()['bar']``, you will implement +``__getitem__`` for class ``Foo``. +Often, this is used along with ``GetitemNavigation`` in your browser code to +ensure smooth traversal. + +The ``__getitem__`` code itself should not, however, contain the magic that +fetches whatever needs to be fetched. +It should instead call another method that does so, explicitly saying what it +is getting. So for example: + +.. code-block:: python + + @implementer(IFoo) + class FooContentClass: + + def __getitem__(self, name): + """See IFoo.""" + return self.getVersion(name) + + def getVersion(self, name): + """See IFoo.""" + # blah blah blah + return version + +Note that generally, a ``__getitem__`` method should give access to just one +kind of thing. In the example above, it gives you access to versions with the +given name. 
If your traversal needs to get two kinds of things, for example +versions or changesets, then this is better put in the traversal code in the +``FooNavigation`` class than in the ``__getitem__`` code of the database class. + +Properties +~~~~~~~~~~ + +Properties should be cheap. Using a property can make accessing fields or +calculated results easier, but programmers expect properties to be usable +without consideration of the internal code in the property. As such, a property +that does expensive work, such as accessing disk resources or performing +database joins, will violate this expectation. This can lead to hard-to-analyze +performance problems because it's not clear what is going on unless you are very +familiar with the code. + + Our code routinely contradicts this guideline. I remember I had issues in the + past with TALES traversal when trying to use methods, and had to use + properties instead. We have decorators such as ``@cachedproperty`` to help + with the performance issues. Someone who knows what he talks about should + update this FAQ to match reality. -- DavidAllouche + +Properties should always be used instead of ``__call__()`` semantics in TALES +expressions. The rule is that in view classes, we don't do this: + +.. code-block:: python + + def foo(self): + ... + +We always do this: + +.. code-block:: python + + @property + def foo(self): + ... + +Storm +----- + +Questions about Storm usage. + +The XXX StormMigrationGuide document is highly recommended. + +How to retrieve a store? +~~~~~~~~~~~~~~~~~~~~~~~~ + +There are two ways of retrieving a Storm ``Store`` before issuing a query +using native syntax. + +The first way retrieves the Store being used by another object. Use this +method when you don't need to make changes, but want your objects to interact +nicely with objects from an unknown Store (such as a method's parameters): + +..
code-block:: python + + from storm.store import Store + + # some_object is any object already loaded from a Store. + store = Store.of(some_object) + result = store.find(Person, Person.name == 'zeca') + +You can also explicitly specify what Store you want to use. You get to choose +the realm (Launchpad main, auth database) and the flavor (master or slave). +If you are retrieving objects that will need to be updated, you need to use +the master. If you are doing a search and we don't mind querying data a few +seconds out of date, you should use the slave. + +.. code-block:: python + + from zope.component import getUtility + + from lp.services.webapp.interfaces import ( + IStoreSelector, MAIN_STORE, AUTH_STORE, + MASTER_FLAVOR, SLAVE_FLAVOR) + + master_store = getUtility(IStoreSelector).get(MAIN_STORE, MASTER_FLAVOR) + master_obj = master_store.find(Person, Person.name == 'zeca') + slave_store = getUtility(IStoreSelector).get(MAIN_STORE, SLAVE_FLAVOR) + slave_obj = slave_store.find(Person, Person.name == 'zeca') + + +If you don't need to update, but require up-to-date data, you should use +the default flavor (e.g. in most views, where the object you are viewing might +just have been created). This will retrieve the master unless the load balancer +is sure all changes made by the current client have propagated to the +replica databases. + +.. code-block:: python + + from lp.services.webapp.interfaces import ( + IStoreSelector, MAIN_STORE, AUTH_STORE, DEFAULT_FLAVOR) + + store = getUtility(IStoreSelector).get(MAIN_STORE, DEFAULT_FLAVOR) + result = store.find(Person, Person.name == 'zeca') + +Security, authentication +------------------------ + +See XXX LaunchpadAuthentication + +How can I get the current user in a database class? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +You need to pass it in as one of the method's parameters. +You **shouldn't** use the ``ILaunchBag`` for this. In fact, you shouldn't use +the ``ILaunchBag`` in any database class.
+ +The principle is that the database code must not rely on implicit state, +meaning state that is present neither in the database object's data nor +in the arguments passed to the method call. +Using ``ILaunchBag`` or ``check_permission`` would rely on this kind of +implicit state. + +How can I protect a method based on one of its parameters? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +You can't! Only attribute access can be protected, and only the attribute name +and the current user are available when that check is made. + +But there is a common pattern you can use: in that method, call another method +on the object passed as a parameter. +That method can be appropriately protected using the current security +infrastructure. +Since this auxiliary method is part of an object-collaboration scenario, it's +usually a good idea to start these methods with the **notify** verb. +The method is notifying the other object that a collaboration is taking place. + +This will often happen with methods that need to operate on bugs, since you +usually don't want the operation to be allowed if it's a private bug that the +user doesn't have access to. + +Example: + +.. code-block:: python + + def linkBug(self, bug): + # If this is a private bug that the user doesn't have access to, + # this will raise an Unauthorized error. + bug.notifyLinkBug(self) + +Email Notifications +------------------- + +When I need to send a notification for a person/team, how do I know what email address(es) I have to send the notification to? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +As you know, persons and teams are meant to be interchangeable in Launchpad, +but when it comes to mail notification the rules differ a bit; see XXX +TeamEmail for more information.
In order to mask these rules, there's a helper +function called ``get_contact_email_addresses()`` in +``lib/lp/services/mail/helpers.py`` that you should always use to get the +contact address of a given person/team. +Please note that this function will always return a set of email addresses, +which is perfectly suitable to be passed in to ``simple_sendmail()``. + +Web UI +------ + +How do I perform an action after an autogenerated edit form has been successfully submitted? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +You need to write a view's class for this form, if you don't have one already. +In your view's class, add a method ``changed()``. + +.. code-block:: python + + def changed(self): + # This method is called after changes have been made. + +You can use this hook to add a redirect, or to execute some logging, for +example. + +How do I perform an action after an autogenerated add form has been successfully submitted? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +You need to write a view's class for this form, if you don't have one already. +In your view's class, add a method ``createAndAdd()``. + +.. code-block:: python + + def createAndAdd(self, data): + # This method is called with the data from the form. + + +You can use this hook to create new objects based on the input from the user. + +How can I redirect a user to a new object just created from an autogenerated add form? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +You need to write a view's class for this form, if you don't have one already. +In your view's class, add a method ``nextURL()``. + +.. code-block:: python + + def nextURL(self): + # This method returns the URL where the user should be redirected. + + +How do I format dates and times in page templates? 
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Let's use some object's ``datecreated`` attribute as an example. + +To format a date, use ``tal:content="context/datecreated/fmt:date"``. + +To format a time, use ``tal:content="context/datecreated/fmt:time"``. + +To format a date and time, use ``tal:content="context/datecreated/fmt:datetime"``. + + +How should I generate notices like "Added Bug #1234" to the top of the page? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. code-block:: python + + response.addInfoNotification('Added Bug #%(bug_id)d', bug_id=bug.id) + +There are other notification levels (Debug, Info, Notice, Warning, Error), as +per XXX BrowserNotificationMessages. + +Launchpad API +------------- + +How do I add a new celebrity? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +See XXX AddingLaunchpadCelebrity. + +Global Configuration +-------------------- + +How do I add items to launchpad-lazr.conf? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +This is done by changing the file ``lib/lp/services/config/schema-lazr.conf``. + +- Items should be created with the following syntax: + + .. code-block:: + + # Comment describing the item. + key_name: default_value + + + ``key_name`` must be a valid Python identifier. + +- Subsections should be created with the following syntax: + + .. code-block:: + + [section_name] + key_name: ... + + ``section_name`` must be a valid Python identifier. + +How are these default values changed for specific environments? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The default configuration values are overridden in +``/configs//launchpad-lazr.conf``. +Notable environments include: + +- ``production``: the production environment; launchpad.net. +- ``staging``: the staging environment; staging.launchpad.net. +- ``development``: the local development environment, used with ``make run`` and ``make run_all``; launchpad.test. +- ``testrunner``: the environment used when running tests.
+ +The syntax for overriding configuration values is the same as the syntax for +defining them. + +How do I use the items listed in launchpad-lazr.conf? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Once your items are added to the ``launchpad-lazr.conf`` file, you may use them +as follows: + +.. code-block:: python + + >>> from lp.services.config import config + >>> # We grab dbname from the default section + >>> dbname = config.dbname + >>> # We grab the dbuser from the gina subsection + >>> dbuser = config.gina.dbuser + + +How can I temporarily override configuration variables in tests? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Use ``lp.testing.TestCase.pushConfig``. + +Testing +------- + +What kind of tests should we use? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +See the XXX TestsStyleGuide for the complete answer. + +The short answer is that we favor the use of doctests in ``lib/lp/*/doc`` for API +documentation and XXX PageTests for use-case documentation. We use doctests and regular Python unit tests to complete the coverage. + +How do I run just one doctest file, e.g. ``lib/lp/*/doc/mytest.txt``? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Use the ``--test`` argument to name it: + +.. code-block:: + + bin/test --test=mytest.txt + +What about running just one pagetest story, e.g. ``lib/lp/*/stories/initial-bug-contacts``? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. code-block:: + + bin/test --test=stories/initial-bug-contacts + +What about running a standalone pagetest, e.g. ``xx-bug-index.txt``? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Like this: + +.. code-block:: + + bin/test --test=xx-bug-index + +And if I want to execute all tests except one? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. code-block:: + + bin/test '!test_to_ignore' + +How can I examine my test output with PDB?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +``bin/test``'s ``-D`` argument is **everyone's** friend. + +If your test raises any exceptions or failures, then the following will open a +pdb shell right where the failure occurred: + +.. code-block:: + + bin/test -D -vvt my.test.name + +Where can I get help on running tests? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Try this: + +.. code-block:: + + bin/test --help + +How can I check test coverage? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The ``bin/test`` script has a ``--coverage`` option that will report on code +coverage. + +How can I run only the tests for the page test layer? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. code-block:: + + bin/test --layer=PageTestLayer + +Where should I put my tests: in a ``test_foo.py`` module, or a ``foo.txt`` doctest file? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +You should prefer doctests. A good rule of thumb is that ``test_*.py`` modules +are best for tests that aren't useful for documentation, but instead for +increasing test coverage to obscure or hard-to-reach code paths. + +It is very easy to write test code that says "check foo does bar", without +explaining why. Doctests tend to trick the author into explaining why. + +However, resist the temptation to insert tests into the system doctests +(``lib/lp/*/doc/*.txt``) that reduce their usefulness as documentation. +Tests which confuse rather than clarify do not belong here. +To a lesser extent, this also applies to other doctests. + +How do I set up my test namespace so I can remove unwanted import statements and other noise? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +For XXX DocFileSuite tests, such as the system documentation tests, you can +pass in ``setUp`` and ``tearDown`` methods. You can stuff values into the +namespace using the ``setUp`` method. + + +.. 
code-block:: python + + from zope.component import getUtility + + def setUp(test): + test.globs['getUtility'] = getUtility + + +Why is my page test failing mysteriously? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +This is often due to a bug in the doctest code that means that ellipses +(``...``) don't match blank lines (``<BLANKLINE>``). +Inserting blank lines in the right parts of the page test should fix it. + +If you are running a single test and getting odd database failures, chances are +you haven't run ``make schema``. When running a single test the database setup step +is skipped, and you need to make sure you've done it beforehand. + +I'm writing a pagetest in the standalone directory that changes some objects. Will my changes be visible to other pagetests in this directory? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +No. The database is reset between each standalone pagetest. + +Why is my page test not failing when adding extra whitespace chars inside a textarea? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Because by default, the page test ignores them so you don't need to take care +of the indentation. Sometimes, the indentation matters, e.g. inside ``
``
+and ``