I'm attempting to add a Mixpanel source but have been unable to replicate data to Snowflake. Some logs indicate that data was loaded, but I can see no records in Snowflake. The final summary on the connection is:
Sync Cancelled
6.27 GB | 6,924,629 records extracted | no records loaded | 18h 30m 1s
My most recent attempt, after resetting the stream, was to load data between April 3, 2024 and April 4, 2024. Mixpanel shows a current count of 5.3M events, while the last count in the Airbyte logs (line 455663 overall, which should be line 244663 in file 2) showed 7.4M records. So something seems off with the filters for the events.
Other strange things I've noticed in the logs include negative hashes like this: 2024-04-03 23:40:55 replication-orchestrator > Could not find the state message with hash -1408807610 in the stagedStatsList. The number of log lines with this statement seems to increase as time goes on, so it appears there are more and more attempts to find the hash in the stats list.
Then there is a negative memory allocation near the end of the logs (file 2), and an apparent loop that loads zero-byte files into Snowflake:
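To put a number on "seems to increase as time goes on", a rough sketch like the one below could tally those messages per hour. The regular expression is built only from the single log line quoted above, so it assumes every such message follows that exact layout. The function names are hypothetical helpers, not part of Airbyte.

```python
import re
from collections import Counter

# Pattern derived from the one log line quoted in this post (assumed layout:
# "YYYY-MM-DD HH:MM:SS replication-orchestrator > Could not find ...").
MISSING_STATE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}):\d{2}:\d{2} replication-orchestrator > "
    r"Could not find the state message with hash (-?\d+) in the stagedStatsList"
)


def count_missing_state_by_hour(lines):
    """Return a Counter mapping 'YYYY-MM-DD HH' to the number of matching lines."""
    counts = Counter()
    for line in lines:
        match = MISSING_STATE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def summarize(path):
    """Print one 'hour count' row per hour for the log file at the given path."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        for hour, count in sorted(count_missing_state_by_hour(handle).items()):
            print(hour, count)
```

Calling `summarize("airbyte_log_1.txt")` and `summarize("airbyte_log_2.txt")` on the attached files would show whether the hourly count really grows over the course of the sync.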
2024-04-04 14:12:56 destination > INFO pool-5-thread-3 i.a.c.i.d.r.BaseSerializedBuffer(flush):170 Finished writing data to e5d7cb69-9f21-4cb4-97b9-da379790a80114759769313155521145.csv.gz (0 bytes)
2024-04-04 14:12:56 destination > INFO pool-5-thread-3 i.a.c.i.d.s.AsyncFlush(flush):88 Flushing CSV buffer for stream export (0 bytes) to staging
2024-04-04 14:12:57 destination > INFO pool-5-thread-3 i.a.c.d.j.DefaultJdbcDatabase(lambda$unsafeQuery$1):132 closing connection
2024-04-04 14:12:57 destination > INFO pool-5-thread-3 i.a.i.d.s.SnowflakeInternalStagingSqlOperations(uploadRecordsToStage):108 Successfully loaded records to stage 2024/04/03/19/E75928A5-74AC-47D5-9988-4E90748E4094/ with 0 re-attempt(s)
2024-04-04 14:12:58 destination > INFO pool-5-thread-3 i.a.i.b.d.t.TypeAndDedupeOperationValve(readyToTypeAndDedupe):88 Skipping Incremental Typing and Deduping
2024-04-04 14:12:58 destination > INFO pool-5-thread-3 i.a.c.i.d.r.FileBuffer(deleteFile):109 Deleting tempFile data e5d7cb69-9f21-4cb4-97b9-da379790a80114759769313155521145.csv.gz
2024-04-04 14:12:58 destination > INFO pool-5-thread-3 i.a.c.i.d.a.s.GlobalAsyncStateManager(flushStates):159 Flushing states
2024-04-04 14:12:58 destination > INFO pool-5-thread-3 i.a.c.i.d.a.s.GlobalAsyncStateManager(flushStates):213 Flushing states complete
2024-04-04 14:12:58 destination > INFO pool-5-thread-3 i.a.c.i.d.a.GlobalMemoryManager(free):78 Freeing 0 bytes..
2024-04-04 14:12:58 destination > INFO pool-5-thread-3 i.a.c.i.d.a.GlobalMemoryManager(free):83 Freed more memory than allocated (0 of -125237279)
2024-04-04 14:12:58 destination > INFO pool-5-thread-3 i.a.c.i.d.a.GlobalMemoryManager(free):78 Freeing 0 bytes..
2024-04-04 14:12:58 destination > INFO pool-5-thread-3 i.a.c.i.d.a.GlobalMemoryManager(free):83 Freed more memory than allocated (0 of -125237279)
2024-04-04 14:12:58 destination > INFO pool-5-thread-3 i.a.c.i.d.a.FlushWorkers(flush$lambda$6):184 Flush Worker (ba637) -- Worker finished flushing. Current queue size: 0
2024-04-04 14:13:16 destination > INFO pool-4-thread-1 i.a.c.i.d.a.b.BufferManager(printQueueInfo):106 [ASYNC QUEUE INFO] Global: max: 742.41 MB, allocated: -125237279 bytes (-119.43557643890381 MB), % used: -0.16087630786904583 | Queue `export`, num records: 0, num bytes: 0 bytes, allocated bytes: 0 bytes | State Manager memory usage: Allocated: -125237279 bytes, Used: -135723039 bytes, percentage Used 1.083727
2024-04-04 14:13:16 destination > INFO pool-7-thread-1 i.a.c.i.d.a.FlushWorkers(printWorkerInfo):130 [ASYNC WORKER INFO] Pool queue size: 0, Active threads: 0
Is there any chance these negatives are coming from an unsigned int overflow?
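As a quick sanity check on the numbers in the logs above, here is a minimal sketch. It assumes the hash is stored as a signed 32-bit integer, which the logs do not confirm, and it recomputes the derived figures in the `[ASYNC QUEUE INFO]` line from the raw byte counts. All variable and function names are mine, not Airbyte's.

```python
def as_unsigned_32(value):
    """Reinterpret an integer's low 32 bits as an unsigned 32-bit value."""
    return value & 0xFFFFFFFF


def as_signed_32(value):
    """Reinterpret an integer's low 32 bits as a signed 32-bit value."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


# Values copied from the log lines in this post.
hash_from_log = -1408807610
allocated_bytes = -125237279
used_bytes = -135723039
max_mb = 742.41  # already rounded in the log, so results below are approximate

# If the hash is a signed 32-bit value (assumption), this is the same bit
# pattern read as unsigned.
hash_as_unsigned = as_unsigned_32(hash_from_log)

# Recompute the derived figures printed by the queue info line.
allocated_mb = allocated_bytes / (1024 * 1024)
fraction_used = allocated_mb / max_mb
state_ratio = used_bytes / allocated_bytes

print(hash_as_unsigned)
print(allocated_mb, fraction_used, state_ratio)
```

The recomputed values agree with the logged ones (about -119.44 MB, about -0.1609, and about 1.0837), so the derived figures are at least consistent with the raw negative byte counts. These values alone do not settle whether a wraparound is involved.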
Hello all,
Source: Mixpanel v2.2.0
Destination: Snowflake v3.6.6
Airbyte Version: v0.56.50, deployed on Kubernetes via Helm chart
Thank you in advance for any help.
airbyte_log_2.txt
airbyte_log_1.txt