-
Notifications
You must be signed in to change notification settings - Fork 395
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Plan migration from Sourcify DB v0 to v1: custom script vs direct SQL script #1635
Comments
With this SQL script we migrate from Some considerations:
-- Begin the transaction
BEGIN;
-- Step 1: Extract and insert unique sources into 'sources' table
WITH extracted_sources AS (
SELECT
cc.id AS compilation_id,
src.key AS path,
src.value #>> '{}' AS content, -- Extracts the text without quotes
digest(src.value #>> '{}' , 'sha256') AS source_hash,
''::bytea AS source_hash_keccak -- Ensure keccak256 function exists
FROM
compiled_contracts cc
CROSS JOIN LATERAL
jsonb_each(cc.sources::jsonb) AS src(key, value)
ORDER BY cc.id -- delete this line for the full migration
LIMIT 100 -- delete this line for the full migration
),
inserted_sources AS (
INSERT INTO sources (
source_hash,
source_hash_keccak,
content
)
SELECT DISTINCT
es.source_hash,
es.source_hash_keccak,
es.content
FROM
extracted_sources es
ON CONFLICT (source_hash) DO NOTHING
RETURNING source_hash
),
inserted_links AS (
INSERT INTO compiled_contracts_sources (
compilation_id,
source_hash,
path
)
SELECT
es.compilation_id,
es.source_hash,
es.path
FROM
extracted_sources es
ON CONFLICT (compilation_id, path) DO NOTHING
RETURNING id
)
-- Main statement to execute the CTEs and retrieve summary counts
SELECT
(SELECT COUNT(*) FROM inserted_sources) AS sources_inserted,
(SELECT COUNT(*) FROM inserted_links) AS links_inserted;
-- Commit the transaction
COMMIT; |
On my machine it took 5 minutes to migrate 383'160 |
Below are the steps of a possible migration from v0 to v1, using the query above, minimizing downtime, and without the need to maintain backward compatibility with v0.
Possible optimization:
|
From what I see you are not deleting the Also not clear to me why you'd shutdown Sourcify shouldn't it be like this?
|
Right, I didn't include this in the process: I think we can keep the table for some time just to be sure that everything is correct.
Also right. I included that step to be clear about the timeframe of the second migration. It can be from TIMESTAMP_START_MIGRATION to TIMESTAMP_RELEASE |
Migration successfully executed on staging! Below you can find a list of steps / problems / fixes we encountered during this process. Migration Process FixesError during SQL script execution:
Environment Weakness:
Staging Execution:
PR Merge:
Schema Adjustment:
Issue Found:
Checked if verification works:
Migration Re-execution:
|
Let's keep this as the issue to track the planning (and execution on staging) and use #1672 to track the execution on the production environment. |
With the new VerA schema (verifier-alliance/database-specs#15), we need to migrate existing
compiled_contracts.sources
to the new tablessources
andcompiled_contracts_sources
. Ensuring data integrity and minimizing downtime are critical for a smooth transition.Evaluate whether to develop a custom js script or to execute a direct migration using an SQL script.
The text was updated successfully, but these errors were encountered: