the bug to write data into juicefs with spark-sql #5088

Closed
GoodJeek opened this issue Aug 15, 2024 · 4 comments
@GoodJeek

GoodJeek commented Aug 15, 2024

What happened:
When a small amount of data is written into a table with INSERT OVERWRITE through spark-sql, it works normally. But when the data size exceeds a certain value, such as 5000, the data is not written into the table completely and some error logs are printed.

The logs are below:

spark driver log:

24/08/15 03:03:53 INFO ShuffleWriteClientImpl: Successfully send heartbeat to Coordinator grpc client ref to 10.39.215.217:19999
24/08/15 03:03:53 INFO ShuffleWriteClientImpl: Successfully send heartbeat to Coordinator grpc client ref to 10.39.215.218:19999
24/08/15 03:03:53 INFO RssShuffleManager: Finish send heartbeat to coordinator and servers
24/08/15 03:03:56 WARN TaskSetManager: Lost task 0.0 in stage 14.0 (TID 131) (10.42.0.245 executor 3): org.apache.spark.SparkException: Task failed while writing rows.
at org.apache.spark.sql.errors.QueryExecutionErrors$.taskFailedWhileWritingRowsError(QueryExecutionErrors.scala:500)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:321)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$write$16(FileFormatWriter.scala:229)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:131)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1491)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: jfs://hive/warehouse/item_01/.hive-staging_hive_2024-08-15_02-58-50_459_6630558984109284475-2/-ext-10000/_temporary/0/_temporary/attempt_202408150258544842979019484022797_0014_m_000000_131/part-00000-a208cb54-a78d-43ef-81d7-abc0e871fcb3-c000
at io.juicefs.JuiceFileSystemImpl.error(JuiceFileSystemImpl.java:281)
at io.juicefs.JuiceFileSystemImpl.access$600(JuiceFileSystemImpl.java:76)
at io.juicefs.JuiceFileSystemImpl$FSOutputStream.close(JuiceFileSystemImpl.java:1018)
at java.io.FilterOutputStream.close(FilterOutputStream.java:159)
at io.juicefs.JuiceFileSystemImpl$BufferedFSOutputStream.close(JuiceFileSystemImpl.java:1139)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:77)
at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
at org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat$1.close(HiveIgnoreKeyTextOutputFormat.java:99)
at org.apache.spark.sql.hive.execution.HiveOutputWriter.close(HiveFileFormat.scala:162)
at org.apache.spark.sql.execution.datasources.FileFormatDataWriter.releaseCurrentWriter(FileFormatDataWriter.scala:64)
at org.apache.spark.sql.execution.datasources.FileFormatDataWriter.releaseResources(FileFormatDataWriter.scala:75)
at org.apache.spark.sql.execution.datasources.FileFormatDataWriter.commit(FileFormatDataWriter.scala:105)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$executeTask$1(FileFormatWriter.scala:305)
at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1525)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:311)
... 9 more
24/08/15 03:03:56 INFO TaskSetManager: Starting task 0.1 in stage 14.0 (TID 132) (10.42.3.149, executor 2, partition 0, ANY, 4472 bytes) taskResourceAssignments Map()
24/08/15 03:03:56 INFO BlockManagerInfo: Added broadcast_17_piece0 in memory on 10.42.3.149:38447 (size: 133.4 KiB, free: 3.3 GiB)
24/08/15 03:03:56 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 2 to 10.42.3.149:44030

spark executor log:

caused by: expected element type <Error> but have <html> (after 11 tries) [writer.go:118]
24/08/15 03:20:27 WARN JuiceFileSystemImpl: 2024/08/15 03:20:27.101260 juicefs[14] : Upload chunks/4/4247/4247266_0_2616981: SerializationError: failed to unmarshal error message
status code: 417, request id: , host id:
caused by: UnmarshalError: failed to unmarshal error message
00000000 3c 21 44 4f 43 54 59 50 45 20 68 74 6d 6c 20 50 |<!DOCTYPE html P|
00000060 3e 3c 68 65 61 64 3e 0a 3c 6d 65 74 61 20 68 74 |><head>.<meta ht|
00000070 74 70 2d 65 71 75 69 76 3d 22 43 6f 6e 74 65 6e |tp-equiv="Conten|
00000080 74 2d 54 79 70 65 22 20 63 6f 6e 74 65 6e 74 3d |t-Type" content=|
00000090 22 74 65 78 74 2f 68 74 6d 6c 3b 20 63 68 61 72 |"text/html; char|
000000a0 73 65 74 3d 75 74 66 2d 38 22 3e 0a 3c 74 69 74 |set=utf-8">.<tit|
000000b0 6c 65 3e 45 52 52 4f 52 3a 20 54 68 65 20 72 65 |le>ERROR: The re|
000000c0 71 75 65 73 74 65 64 20 55 52 4c 20 63 6f 75 6c |quested URL coul|
000000d0 64 20 6e 6f 74 20 62 65 20 72 65 74 72 69 65 76 |d not be retriev|
000000e0 65 64 3c 2f 74 69 74 6c 65 3e 0a 3c 73 74 79 6c |ed</title>.<styl|
000000f0 65 20 74 79 70 65 3d 22 74 65 78 74 2f 63 73 73 |e type="text/css|
00000100 22 3e 3c 21 2d 2d 20 0a 20 2f 2a 0a 20 53 74 79 |"><!-- . /*. Sty|
00000110 6c 65 73 68 65 65 74 20 66 6f 72 20 53 71 75 69 |lesheet for Squi|
00000120 64 20 45 72 72 6f 72 20 70 61 67 65 73 0a 20 41 |d Error pages. A|
00000130 64 61 70 74 65 64 20 66 72 6f 6d 20 64 65 73 69 |dapted from desi|
00000140 67 6e 20 62 79 20 46 72 65 65 20 43 53 53 20 54 |gn by Free CSS T|
00000150 65 6d 70 6c 61 74 65 73 0a 20 68 74 74 70 3a 2f |emplates. http:/|
00000160 2f 77 77 77 2e 66 72 65 65 63 73 73 74 65 6d 70 |/www.freecsstemp|
00000170 6c 61 74 65 73 2e 6f 72 67 0a 20 52 65 6c 65 61 |lates.org. Relea|
00000180 73 65 64 20 66 6f 72 20 66 72 65 65 20 75 6e 64 |sed for free und|
00000190 65 72 20 61 20 43 72 65 61 74 69 76 65 20 43 6f |er a Creative Co|
000001a0 6d 6d 6f 6e 73 20 41 74 74 72 69 62 75 74 69 6f |mmons Attributio|
000001b0 6e 20 32 2e 35 20 4c 69 63 65 6e 73 65 0a 2a 2f |n 2.5 License.*/|
000001c0 0a 0a 2f 2a 20 50 61 67 65 20 62 61 73 69 63 73 |../* Page basics|
000001d0 20 2a 2f 0a 2a 20 7b 0a 09 66 6f 6e 74 2d 66 61 | */.* {..font-fa|
000001e0 6d 69 6c 79 3a 20 76 65 72 64 61 6e 61 2c 20 73 |mily: verdana, s|
000001f0 61 6e 73 2d 73 65 72 69 66 3b 0a 7d 0a 0a 68 74 |ans-serif;.}..ht|
00000200 6d 6c 20 62 6f 64 79 20 7b 0a 09 6d 61 72 67 69 |ml body {..margi|
00000210 6e 3a 20 30 3b 0a 09 70 61 64 64 69 6e 67 3a 20 |n: 0;..padding: |
00000220 30 3b 0a 09 62 61 63 6b 67 72 6f 75 6e 64 3a 20 |0;..background: |
00000230 23 65 66 65 66 65 66 3b 0a 09 66 6f 6e 74 2d 73 |#efefef;..font-s|
00000240 69 7a 65 3a 20 31 32 70 78 3b 0a 09 63 6f 6c 6f |ize: 12px;..colo|
00000250 72 3a 20 23 31 65 31 65 31 65 3b 0a 7d 0a 0a 2f |r: #1e1e1e;.}../|
00000260 2a 20 50 61 67 65 20 64 69 73 70 6c 61 79 65 64 |* Page displayed|
00000270 20 74 69 74 6c 65 20 61 72 65 61 20 2a 2f 0a 23 | title area */.#|
00000280 74 69 74 6c 65 73 20 7b 0a 09 6d 61 72 67 69 6e |titles {..margin|
00000290 2d 6c 65 66 74 3a 20 31 35 70 78 3b 0a 09 70 61 |-left: 15px;..pa|
000002a0 64 64 69 6e 67 3a 20 31 30 70 78 3b 0a 09 70 61 |dding: 10px;..pa|
000002b0 64 64 69 6e 67 2d 6c 65 66 74 3a 20 31 30 30 70 |dding-left: 100p|
000002c0 78 3b 0a 09 62 61 63 6b 67 72 6f 75 6e 64 3a 20 |x;..background: |
000002d0 75 72 6c 28 27 68 74 74 70 3a 2f 2f 77 77 77 2e |url('http://www.|
000002e0 73 71 75 69 64 2d 63 61 63 68 65 2e 6f 72 67 2f |squid-cache.org/|
000002f0 41 72 74 77 6f 72 6b 2f 53 4e 2e 70 6e 67 27 29 |Artwork/SN.png')|
00000300 20 6e 6f 2d 72 65 70 65 61 74 20 6c 65 66 74 3b | no-repeat left;|
00000310 0a 7d 0a 0a 2f 2a 20 69 6e 69 74 69 61 6c 20 74 |.}../* initial t|
00000320 69 74 6c 65 20 2a 2f 0a 23 74 69 74 6c 65 73 20 |itle */.#titles |
00000330 68 31 20 7b 0a 09 63 6f 6c 6f 72 3a 20 23 30 30 |h1 {..color: #00|
00000340 30 30 30 30 3b 0a 7d 0a 23 74 69 74 6c 65 73 20 |0000;.}.#titles |
00000350 68 32 20 7b 0a 09 63 6f 6c 6f 72 3a 20 23 30 30 |h2 {..color: #00|
00000360 30 30 30 30 3b 0a 7d 0a 0a 2f 2a 20 73 70 65 63 |0000;.}../* spec|
00000370 69 61 6c 20 65 76 65 6e 74 3a 20 46 54 50 20 73 |ial event: FTP s|
00000380 75 63 63 65 73 73 20 70 61 67 65 20 74 69 74 6c |uccess page titl|
00000390 65 73 20 2a 2f 0a 23 74 69 74 6c 65 73 20 66 74 |es */.#titles ft|
000003a0 70 73 75 63 63 65 73 73 20 7b 0a 09 62 61 63 6b |psuccess {..back|
000003b0 67 72 6f 75 6e 64 2d 63 6f 6c 6f 72 3a 23 30 30 |ground-color:#00|
000003c0 66 66 30 30 3b 0a 09 77 69 64 74 68 3a 31 30 30 |ff00;..width:100|
000003d0 25 3b 0a 7d 0a 0a 2f 2a 20 50 61 67 65 20 64 69 |%;.}../* Page di|
000003e0 73 70 6c 61 79 65 64 20 62 6f 64 79 20 63 6f 6e |splayed body con|
000003f0 74 65 6e 74 20 61 72 65 61 20 2a 2f 0a 23 63 6f |tent area */.#co|
00000400 6e 74 65 6e 74 20 7b 0a 09 70 61 64 64 69 6e 67 |ntent {..padding|
00000410 3a 20 31 30 70 78 3b 0a 09 62 61 63 6b 67 72 6f |: 10px;..backgro|
00000420 75 6e 64 3a 20 23 66 66 66 66 66 66 3b 0a 7d 0a |und: #ffffff;.}.|
00000430 0a 2f 2a 20 47 65 6e 65 72 61 6c 20 74 65 78 74 |./* General text|
00000440 20 2a 2f 0a 70 20 7b 0a 7d 0a 0a 2f 2a 20 65 72 | */.p {.}../* er|
00000450 72 6f 72 20 62 72 69 65 66 20 64 65 73 63 72 69 |ror brief descri|
00000460 70 74 69 6f 6e 20 2a 2f 0a 23 65 72 72 6f 72 20 |ption */.#error |
00000470 70 20 7b 0a 7d 0a 0a 2f 2a 20 73 6f 6d 65 20 64 |p {.}../* some d|
00000480 61 74 61 20 77 68 69 63 68 20 6d 61 79 20 68 61 |ata which may ha|
00000490 76 65 20 63 61 75 73 65 64 20 74 68 65 20 70 72 |ve caused the pr|
000004a0 6f 62 6c 65 6d 20 2a 2f 0a 23 64 61 74 61 20 7b |oblem */.#data {|
000004b0 0a 7d 0a 0a 2f 2a 20 74 68 65 20 65 72 72 6f 72 |.}../* the error|
000004c0 20 6d 65 73 73 61 67 65 20 72 65 63 65 69 76 65 | message receive|
000004d0 64 20 66 72 6f 6d 20 74 68 65 20 73 79 73 74 65 |d from the syste|
000004e0 6d 20 6f 72 20 6f 74 68 65 72 20 73 6f 66 74 77 |m or other softw|
000004f0 61 72 65 20 2a 2f 0a 23 73 79 73 6d 73 67 20 7b |are */.#sysmsg {|
00000500 0a 7d 0a 0a 70 72 65 20 7b 0a 20 20 20 20 66 6f |.}..pre {.    fo|
00000510 6e 74 2d 66 61 6d 69 6c 79 3a 73 61 6e 73 2d 73 |nt-family:sans-s|
00000520 65 72 69 66 3b 0a 7d 0a 0a 2f 2a 20 73 70 65 63 |erif;.}../* spec|
00000530 69 61 6c 20 65 76 65 6e 74 3a 20 46 54 50 20 2f |ial event: FTP /|
00000540 20 47 6f 70 68 65 72 20 64 69 72 65 63 74 6f 72 | Gopher director|
00000550 79 20 6c 69 73 74 69 6e 67 20 2a 2f 0a 23 64 69 |y listing */.#di|
00000560 72 6d 73 67 20 7b 0a 20 20 20 20 66 6f 6e 74 2d |rmsg {.    font-|
00000570 66 61 6d 69 6c 79 3a 20 63 6f 75 72 69 65 72 3b |family: courier;|
00000580 0a 20 20 20 20 63 6f 6c 6f 72 3a 20 62 6c 61 63 |.    color: blac|
00000590 6b 3b 0a 20 20 20 20 66 6f 6e 74 2d 73 69 7a 65 |k;.    font-size|
000005a0 3a 20 31 30 70 74 3b 0a 7d 0a 23 64 69 72 6c 69 |: 10pt;.}.#dirli|
000005b0 73 74 69 6e 67 20 7b 0a 20 20 20 20 6d 61 72 67 |sting {.    marg|
000005c0 69 6e 2d 6c 65 66 74 3a 20 32 25 3b 0a 20 20 20 |in-left: 2%;.   |
000005d0 20 6d 61 72 67 69 6e 2d 72 69 67 68 74 3a 20 32 | margin-right: 2|
000005e0 25 3b 0a 7d 0a 23 64 69 72 6c 69 73 74 69 6e 67 |%;.}.#dirlisting|
000005f0 20 74 72 2e 65 6e 74 72 79 20 74 64 2e 69 63 6f | tr.entry td.ico|
00000600 6e 2c 74 64 2e 66 69 6c 65 6e 61 6d 65 2c 74 64 |n,td.filename,td|
00000610 2e 73 69 7a 65 2c 74 64 2e 64 61 74 65 20 7b 0a |.size,td.date {.|
00000620 20 20 20 20 62 6f 72 64 65 72 2d 62 6f 74 74 6f |    border-botto|
00000630 6d 3a 20 67 72 6f 6f 76 65 3b 0a 7d 0a 23 64 69 |m: groove;.}.#di|
00000640 72 6c 69 73 74 69 6e 67 20 74 64 2e 73 69 7a 65 |rlisting td.size|
00000650 20 7b 0a 20 20 20 20 77 69 64 74 68 3a 20 35 30 | {.    width: 50|
00000660 70 78 3b 0a 20 20 20 20 74 65 78 74 2d 61 6c 69 |px;.    text-ali|
00000670 67 6e 3a 20 72 69 67 68 74 3b 0a 20 20 20 20 70 |gn: right;.    p|
00000680 61 64 64 69 6e 67 2d 72 69 67 68 74 3a 20 35 70 |adding-right: 5p|
00000690 78 3b 0a 7d 0a 0a 2f 2a 20 68 6f 72 69 7a 6f 6e |x;.}../* horizon|
000006a0 74 61 6c 20 6c 69 6e 65 73 20 2a 2f 0a 68 72 20 |tal lines */.hr |
000006b0 7b 0a 09 6d 61 72 67 69 6e 3a 20 30 3b 0a 7d 0a |{..margin: 0;.}.|
000006c0 0a 2f 2a 20 70 61 67 65 20 64 69 73 70 6c 61 79 |./* page display|
000006d0 65 64 20 66 6f 6f 74 65 72 20 61 72 65 61 20 2a |ed footer area *|
000006e0 2f 0a 23 66 6f 6f 74 65 72 20 7b 0a 09 66 6f 6e |/.#footer {..fon|
000006f0 74 2d 73 69 7a 65 3a 20 39 70 78 3b 0a 09 70 61 |t-size: 9px;..pa|
00000700 64 64 69 6e 67 2d 6c 65 66 74 3a 20 31 30 70 78 |dding-left: 10px|
00000710 3b 0a 7d 0a 0a 0a 62 6f 64 79 0a 3a 6c 61 6e 67 |;.}...body.:lang|
00000720 28 66 61 29 20 7b 20 64 69 72 65 63 74 69 6f 6e |(fa) { direction|
00000730 3a 20 72 74 6c 3b 20 66 6f 6e 74 2d 73 69 7a 65 |: rtl; font-size|
00000740 3a 20 31 30 30 25 3b 20 66 6f 6e 74 2d 66 61 6d |: 100%; font-fam|
00000750 69 6c 79 3a 20 54 61 68 6f 6d 61 2c 20 52 6f 79 |ily: Tahoma, Roy|
00000760 61 2c 20 73 61 6e 73 2d 73 65 72 69 66 3b 20 66 |a, sans-serif; f|
00000770 6c 6f 61 74 3a 20 72 69 67 68 74 3b 20 7d 0a 3a |loat: right; }.:|
00000780 6c 61 6e 67 28 68 65 29 20 7b 20 64 69 72 65 63 |lang(he) { direc|
00000790 74 69 6f 6e 3a 20 72 74 6c 3b 20 7d 0a 20 2d 2d |tion: rtl; }. --|
000007a0 3e 3c 2f 73 74 79 6c 65 3e 0a 3c 2f 68 65 61 64 |></style>.</head|
000007b0 3e 3c 62 6f 64 79 20 69 64 3d 45 52 52 5f 49 4e |><body id=ERR_IN|
000007c0 56 41 4c 49 44 5f 52 45 51 3e 0a 3c 64 69 76 20 |VALID_REQ>.<div |
000007d0 69 64 3d 22 74 69 74 6c 65 73 22 3e 0a 3c 68 31 |id="titles">.<h1|
00000850 65 73 74 3c 2f 62 3e 20 65 72 72 6f 72 20 77 61 |est</b> error wa|
00000860 73 20 65 6e 63 6f 75 6e 74 65 72 65 64 20 77 68 |s encountered wh|
00000930 37 3b 20 6c 69 6e 75 78 3b 20 61 6d 64 36 34 29 |7; linux; amd64)|
00000940 0d 0a 43 6f 6e 74 65 6e 74 2d 4c 65 6e 67 74 68 |..Content-Length|
00000950 3a 20 32 36 31 36 39 38 31 0d 0a 41 75 74 68 6f |: 2616981..Autho|
00000960 72 69 7a 61 74 69 6f 6e 3a 20 2a 2a 20 4e 4f 54 |rization: ** NOT|
00000970 20 44 49 53 50 4c 41 59 45 44 20 2a 2a 0d 0a 43 | DISPLAYED **..C|
00000980 6f 6e 74 65 6e 74 2d 4d 44 35 3a 20 45 65 34 2f |ontent-MD5: Ee4/|
00000990 41 7a 72 5a 4a 6e 4d 4d 53 7a 34 31 31 71 2f 76 |AzrZJnMMSz411q/v|
00000a70 70 72 6f 62 6c 65 6d 73 20 61 72 65 3a 3c 2f 70 |problems are:</p|
00000fe0 4d 53 7a 34 31 31 71 25 32 46 76 59 77 25 33 44 |MSz411q%2FvYw%3D|
00000ff0 25 33 44 25 30 44 25 30 41 43 6f 6e 74 65 6e 74 |%3D%0D%0AContent|
caused by: expected element type <Error> but have <html> (try 11) [cached_store.go:390]
24/08/15 03:20:27 ERROR JuiceFileSystemImpl: 2024/08/15 03:20:27.101709 juicefs[14] : upload chunk 4247266 (length: 2616981) fail: (max tries) upload block chunks/4/4247/4247266_0_2616981: SerializationError: failed to unmarshal error message
status code: 417, request id: , host id:
caused by: UnmarshalError: failed to unmarshal error message
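
The dumped response body starts with `<!DOCTYPE html` and contains Squid error-page markup, while the client was trying to parse an XML error document. A minimal Python sketch of that distinction follows. The function name `classify_error_body` and the sample payloads are hypothetical illustrations and are not part of JuiceFS or any SDK.

```python
import xml.etree.ElementTree as ET


def classify_error_body(body: bytes) -> str:
    """Return 'html', 's3-xml', or 'unknown' for an HTTP error body.

    Hypothetical helper: an HTML body on an object-storage request hints
    that something other than the object store answered (e.g. a proxy).
    """
    head = body.lstrip().lower()
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return "html"
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return "unknown"
    # Drop an XML namespace prefix such as "{http://...}Error" if present.
    tag = root.tag.rsplit("}", 1)[-1]
    return "s3-xml" if tag == "Error" else "unknown"


if __name__ == "__main__":
    sample = b"<!DOCTYPE html PUBLIC ...><html><head></head></html>"
    print(classify_error_body(sample))
```

A result of `html` for a failed upload would be a reason to look at what sits between the client and the object storage endpoint.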

What you expected to happen:
I expect large amounts of data to be written into the table correctly and promptly with spark-sql, with the data stored in JuiceFS/MinIO and the metadata stored in MySQL.

Environment:

  • JuiceFS version (use juicefs --version) or Hadoop Java SDK version: JuiceFS version 1.1.0
  • Cloud provider or hardware configuration running JuiceFS:
  • OS (e.g cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Object storage (cloud provider and region, or self maintained):
  • Metadata engine info (version, cloud provider managed or self maintained):
  • Network connectivity (JuiceFS to metadata engine, JuiceFS to object storage):
  • Others:
GoodJeek added the kind/bug (Something isn't working) label Aug 15, 2024
@zhijian-pro
Contributor

Which object storage are you using?

zhijian-pro self-assigned this Aug 19, 2024
@GoodJeek
Author

minio

@zhijian-pro
Contributor

zhijian-pro commented Aug 23, 2024

Is this error unrelated to the content of the written data?
Does it always occur whenever the amount of data being written exceeds a certain size?

Is there any network middleware between JuiceFS and MinIO that could truncate or alter the returned data, producing a malformed response that cannot be parsed?

davies added the needs-more-info (This issue requires more information to address) label and removed the kind/bug (Something isn't working) label Aug 27, 2024
@zhijian-pro
Contributor

Resolved: the cause was determined to be a proxy configured in the user's network.
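
For readers who hit the same symptom, one way to see whether a proxy setting would apply to the object storage endpoint is sketched below in Python. It assumes the proxy is configured through the conventional `HTTP_PROXY` / `HTTPS_PROXY` / `ALL_PROXY` / `NO_PROXY` environment variables, which this thread does not confirm. The helper `proxy_for` and the endpoint hostnames are hypothetical, and the `NO_PROXY` matching is a simplified suffix match that real HTTP clients may implement differently.

```python
import os
from urllib.parse import urlparse


def proxy_for(endpoint: str, env=None):
    """Return the proxy URL that would apply to `endpoint`, or None.

    Simplified, hypothetical logic: NO_PROXY entries are matched as exact
    hosts or domain suffixes, and "*" disables proxying entirely.
    """
    env = os.environ if env is None else env
    lowered = {key.lower(): value for key, value in env.items()}
    parsed = urlparse(endpoint)
    host = (parsed.hostname or "").lower()
    for entry in lowered.get("no_proxy", "").split(","):
        entry = entry.strip().lower().lstrip(".")
        if entry == "*":
            return None
        if entry and (host == entry or host.endswith("." + entry)):
            return None
    return lowered.get(f"{parsed.scheme}_proxy") or lowered.get("all_proxy")


if __name__ == "__main__":
    # Hypothetical endpoint; replace with the bucket endpoint in use.
    print(proxy_for("http://minio.example.internal:9000"))
```

If the function returns a proxy URL for the storage endpoint, adding that host to `NO_PROXY` (or removing the proxy setting for the job) is one thing to try before retrying the write.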
