Multiple processes running concurrently using libmodbus over Modbus-Rtu #731
Comments
You should tell us more about the Modbus server device, the hardware used on the Raspi side (RS-232/RS-485...), which UART drivers are involved, etc. |
Yes, you are right about the details of the issue. I have looked for probable sources of the problem in different setups. I also agree that it is questionable whether this is a libmodbus issue, but the behavior of the implementation may vary across operating systems and hardware. I will describe my real setup to give more insight into the issue. The setup in my first comment was only meant to check whether there was a software issue in the processes.
Kernel release version: 5.15.84-v7l+
Setup
This is the actual setup I want to use in my final project. There are two programs running on different UART ports (both PL011 UARTs) of the RPi4, with baud rates of 2000000 and 115200. To make the statements clearer, I will call the process with the 2000000 baud rate 'Process 1' and the process with the 115200 baud rate 'Process 2', matching the configurations and logs given for each process.
Process 1
Step 1: Available RS485 slaves are detected.
Process 2
Baud rate: 115200
Tests
These logs may be unnecessary, but I feel they may point to the source of the problem when combined with the logs given for the other cases.
Error log for Process 1 while both processes are running concurrently
Using the same timeouts for different baud rates may seem odd, but I wanted to report the values I actually tested. Increasing the timeout for a process reduces communication errors, but the duration of 1000 cycles then increases in proportion to the timeout. I should mention that I am not very experienced on the Linux side, but I think the problem is caused by interrupt and process scheduling latencies within the OS. I haven't tried my setup on a different OS, but the problem doesn't seem to be a hardware issue. The times calculated for checking the related file descriptors may need to be changed on the libmodbus side for this case. |
I know this may sound crazy, but as a test, slow everything down. Lower all your baud rates to 19.2 kbaud (19200) or similar and raise your timeouts accordingly.
This reduces or eliminates many hardware, connection, cabling, and timing issues. If everything works satisfactorily at a much slower speed, you then know it is probably not a software issue.
Then you can start raising the speeds again to find the point where the issue reappears.
One of your baud rates is 2 Mbaud! At this speed, many other issues can creep into the troubleshooting process.
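The "slow down, then step back up" procedure can be written as a small loop. This is only a sketch: `run_trial` is a hypothetical hook (not part of libmodbus) that you would implement yourself to reconfigure both ends for a given baud rate, run a fixed number of cycles, and return the number of failed cycles.

```python
from typing import Callable, Iterable, Optional


def highest_clean_baud(rates: Iterable[int],
                       run_trial: Callable[[int], int]) -> Optional[int]:
    """Walk the baud rates from slowest to fastest and return the last one
    whose trial reported zero failed cycles (None if even the slowest fails).

    run_trial is a hypothetical, user-supplied function: it takes a baud
    rate and returns how many cycles failed at that rate.
    """
    best = None
    for baud in sorted(rates):
        if run_trial(baud) != 0:
            break  # first rate that shows errors: stop climbing
        best = baud
    return best
```

Running it once with a single process and once with both processes active would show whether the point where errors start moves when the second process is running.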
From: Hüseyin ***@***.***>
Sent: Thursday, December 28, 2023 7:57 AM
Yes, you are right about the details of the issue. I have looked for probable sources of the problem in different setups. I also agree that it is questionable whether this is a libmodbus issue, but the behavior of the implementation may vary across operating systems and hardware.
I will describe my real setup to give more insight into the issue. The setup in my first comment was only meant to check whether there was a software issue in the processes.
Setup
This is the actual setup I want to use in my final project. There are two programs running on different UART ports (both PL011 UARTs) of the RPi4, with baud rates of 2000000 and 115200. To make the statements clearer, I will call the process with the 2000000 baud rate 'Process 1' and the process with the 115200 baud rate 'Process 2', matching the configurations and logs below.
Process 1
"Baudrate": 2000000,
"Byte Timeout": {
"sec": 0,
"usec":50
},
"Response Timeout": {
"sec": 0,
"usec": 20000
},
Step 1: Available RS485 slaves are detected.
Step 2: Continuous write-read phase in an infinite loop: a write to 16 registers with 'modbus_write_registers', followed by a read of 1 register.
Process 2
"Baudrate": 115200,
"Byte Timeout": {
"sec": 0,
"usec":50
},
"Response Timeout": {
"sec": 0,
"usec": 20000
},
Baud rate: 115200
Step 1: Available RS485 slaves are already known. Three read-registers operations with lengths of 100, 100, and 26 registers are executed.
Step 2: Continuous write-read phase in an infinite loop: a write to 15 registers with 'modbus_write_registers', followed by a read of 1 register.
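For a rough baseline, the bytes on the wire per write-read cycle can be worked out from the Modbus RTU frame layout. The sketch below assumes 8N1 framing (10 bits per character), which the post does not state, and ignores inter-frame gaps and the slave's turnaround time, so the result is only a lower bound. The function names are made up for illustration.

```python
BITS_PER_CHAR = 10  # assumption: 8N1 (1 start + 8 data + 1 stop bit)


def read_regs_sizes(n_regs):
    """(request, response) RTU frame sizes in bytes for function 0x03."""
    # request: addr + fc + start(2) + quantity(2) + crc(2)
    # response: addr + fc + byte count + 2 bytes per register + crc(2)
    return 8, 5 + 2 * n_regs


def write_regs_sizes(n_regs):
    """(request, response) RTU frame sizes in bytes for function 0x10."""
    # request: addr + fc + start(2) + quantity(2) + byte count + data + crc(2)
    # response: addr + fc + start(2) + quantity(2) + crc(2)
    return 9 + 2 * n_regs, 8


def cycle_wire_time_s(baud, n_write, n_read):
    """Minimum time on the wire for one write + one read, in seconds."""
    total_bytes = sum(write_regs_sizes(n_write)) + sum(read_regs_sizes(n_read))
    return total_bytes * BITS_PER_CHAR / baud
```

With a 16-register write and a 1-register read this gives 64 bytes per cycle, so 1000 cycles need at least about 5.6 s at 115200 baud and about 0.32 s at 2000000 baud, before any gaps or processing delays are added.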
Tests
//Single run of Process 1 - Step 2: logs in a continuous loop.
Total number of scan cycles -> 7000
Slot Number: 1, Successful: 7000, Failed: 0
Duration for 1000 main cycles: 3.121561 seconds
Total number of scan cycles -> 8000
Slot Number: 1, Successful: 8000, Failed: 0
//Single run of Process 2 - Step 2: logs in a continuous loop.
Duration for 1000 main cycles: 7.027475 seconds
Slot Number: 1, Successful: 2000, Failed: 0
Duration for 1000 main cycles: 7.039133 seconds
Slot Number: 1, Successful: 3000, Failed: 0
//Process 1 logs when both processes are running
Total number of scan cycles -> 3000
Slot Number: 1, Successfull: 1268, Failed: 1732
Duration for 1000 main cycles: 14.515857 seconds
Total number of scan cycles -> 4000
Slot Number: 1, Successful: 1786, Failed: 2214
Duration for 1000 main cycles: 12.842769 seconds
//Process 2 logs when both processes are running
Duration for 1000 main cycles: 7.081596 seconds
Slot Number: 1, Successful: 32000, Failed: 0
Duration for 1000 main cycles: 7.081268 seconds
Slot Number: 1, Successful: 33000, Failed: 0
Duration for 1000 main cycles: 7.033406 seconds
Slot Number: 1, Successful: 34000, Failed: 0
These logs may be unnecessary, but I feel they may point to the source of the problem when combined with the logs given for the other cases.
To summarize: while each process runs by itself there is no communication problem, but when the two processes run concurrently, communication problems start to occur. I tried reading and writing different register counts, and this is what I saw for Process 1:
16-register write, 1-register read -> all the errors occur on the write operation; the read operation works properly.
16-register write, 16-register read -> errors occur on both read and write operations in similar ratios.
1-register write, 16-register read -> fewer errors occur on both read and write operations, and most of them occur on read operations. Logs for this case are given below; an 'r' or 'w' in the log marks the operation on which an error occurred.
Duration for 1000 main cycles: 3.587441 seconds
rrwrwrwrwrwrwrrrrrrTotal number of scan cycles -> 43000
Slot Number: 1, Successful: 42816, Failed: 184
Duration for 1000 main cycles: 3.588082 seconds
rrrrrTotal number of scan cycles -> 44000
Slot Number: 1, Successfull: 43811, Failed: 189
Duration for 1000 main cycles: 3.537136 seconds
rrrrrrrrrrrTotal number of scan cycles -> 45000
Slot Number: 1, Successful: 44800, Failed: 200
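To compare the cases above, the failure ratio can be pulled out of the counter lines directly. This is a small sketch with made-up helper names; it assumes the counters are cumulative (as they appear to be in the logs) and accepts both spellings, "Successful" and "Successfull", that occur in the output.

```python
import re

# Counter line as printed in the logs; "Successful+" also matches "Successfull".
_COUNTER = re.compile(r"Slot Number: (\d+), Successful+: (\d+), Failed: (\d+)")


def parse_counters(line):
    """Return (successful, failed) from a counter line, or None if no match."""
    m = _COUNTER.search(line)
    if m is None:
        return None
    return int(m.group(2)), int(m.group(3))


def failure_rate(line):
    """Cumulative failure ratio for one counter line (None if not a counter line)."""
    counters = parse_counters(line)
    if counters is None:
        return None
    ok, failed = counters
    total = ok + failed
    return failed / total if total else 0.0


def interval_failure_rate(prev_line, cur_line):
    """Failure ratio for the cycles between two consecutive counter lines."""
    ok0, failed0 = parse_counters(prev_line)
    ok1, failed1 = parse_counters(cur_line)
    cycles = (ok1 + failed1) - (ok0 + failed0)
    return (failed1 - failed0) / cycles if cycles else 0.0
```

For the concurrent Process 1 run, the line with 1268 successful and 1732 failed cycles works out to a cumulative failure ratio of about 58%, while the 1-register write, 16-register read case stays below 0.5%.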
Error log for Process 1 while both processes are running concurrently
Waiting for a confirmation...
<01><03><02><00><00><B8><44>
[01][10][00][0A][00][10][20][00][01][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][92][A9]
Waiting for a confirmation...
<01><10><00><0A><00><10><E1><C7>
[01][03][00][1A][00][01][A5][CD]
Waiting for a confirmation...
<01><03><02><00><00><B8><44>
[01][10][00][0A][00][10][20][00][01][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][92][A9]
Waiting for a confirmation...
ERROR Connection timed out: select
w[01][03][00][1A][00][01][A5][CD]
Waiting for a confirmation...
<01><03><02><00><00><B8><44>
[01][10][00][0A][00][10][20][00][01][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][92][A9]
Waiting for a confirmation...
ERROR Connection timed out: select
w[01][03][00][1A][00][01][A5][CD]
Waiting for a confirmation...
<01><03><02><00><00><B8><44>
[01][10][00][0A][00][10][20][00][01][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][00][92][A9]
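One quick sanity check on a debug log like the one above is to recompute the CRC of each complete frame, which tells you whether the bytes that did arrive are intact. The sketch below uses the standard Modbus RTU CRC-16 (initial value 0xFFFF, reflected polynomial 0xA001, low byte transmitted first); the helper names are made up, and the parser only assumes the `[xx]` / `<xx>` notation seen in the log.

```python
import re


def crc16_modbus(data: bytes) -> int:
    """Modbus RTU CRC-16: init 0xFFFF, reflected polynomial 0xA001."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0xA001 if crc & 1 else crc >> 1
    return crc


def parse_debug_frame(line: str) -> bytes:
    """Extract the bytes from a '[01][03]...' or '<01><03>...' debug line."""
    return bytes(int(h, 16)
                 for h in re.findall(r"[\[<]([0-9A-Fa-f]{2})[\]>]", line))


def frame_crc_ok(line: str) -> bool:
    """True if the last two bytes of the frame match the CRC of the rest."""
    frame = parse_debug_frame(line)
    if len(frame) < 4:
        return False
    # The CRC is sent low byte first.
    return crc16_modbus(frame[:-2]) == int.from_bytes(frame[-2:], "little")
```

Both the read request `[01][03][00][1A][00][01][A5][CD]` and the reply `<01><03><02><00><00><B8><44>` from the log pass this check, so the frames that are logged are well formed; the failures show up as timeouts, not as corrupted data.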
Using the same timeouts for different baud rates may seem odd, but I wanted to report the values I actually tested. Increasing the timeout for a process reduces communication errors, but the duration of 1000 cycles then increases in proportion to the timeout.
I should mention that I am not very experienced on the Linux side, but I think the problem is caused by interrupt and process scheduling latencies within the OS. I haven't tried my setup on a different OS, but the problem doesn't seem to be a hardware issue. The times calculated for checking the related file descriptors may need to be changed on the libmodbus side for this case.
|
The timeouts are really tight, and with respect to the baud rates, I wonder whether the whole approach with Linux as a non-RTOS system makes sense. As already mentioned, I'd also try to slow down. And since you use RS-485, you can try to connect a third observer system to each RS-485 line and let it sniff the Modbus traffic. That way you can at least see whether the Modbus server's replies are correct and complete, and so on... |
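To put numbers on "tight": the time one character occupies on the wire can be compared with the configured byte timeout. This is a rough sketch that assumes 8N1 framing (10 bits per character), which is not stated in the thread; the two timeout constants are the values from the posted configuration, and the function names are made up.

```python
BYTE_TIMEOUT_US = 50          # "Byte Timeout" from the posted configuration
RESPONSE_TIMEOUT_US = 20000   # "Response Timeout" from the posted configuration


def char_time_us(baud, bits_per_char=10):
    """Time one character occupies on the wire, in microseconds.

    bits_per_char=10 assumes 8N1 framing (an assumption, not from the post).
    """
    return bits_per_char * 1_000_000 / baud


def byte_timeout_in_chars(baud, byte_timeout_us=BYTE_TIMEOUT_US):
    """How many character times fit into the byte timeout."""
    return byte_timeout_us / char_time_us(baud)
```

At 2000000 baud a character takes 5 us, so the 50 us byte timeout allows a gap of about 10 characters. At 115200 baud a character takes about 86.8 us, so the same 50 us is shorter than a single character; on a non-real-time kernel, scheduling and UART driver latencies can easily be larger than either margin.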
To let everyone know: the issue is related to the RPi4 UART driver. Changes to the related chip's driver are required. There are related discussions on the Raspberry Pi forum. Thank you for the help, @mhei. |
libmodbus version
OS and/or distribution
Environment
Description
Both processes use the libmodbus library. When either process runs by itself, I can perform read-write operations without any problem. However, when the two processes run concurrently (started with a 2-second delay), the process started later can't execute long register operations (120 registers, 2 bytes each). A "Connection timed out" error is returned from modbus_read_registers or modbus_write_registers. When I dug into the related functions, I found that the _modbus_rtu_select function in src/modbus-rtu.c cannot complete its wait on the related file descriptors. Changing the byte and response timeouts only delays the moment the error is returned, and one of the processes still doesn't function properly. Both processes scan their devices at a baud rate of 115200. When I change the baud rate for one of the devices to 2000000, one process still functions properly and the other can read long registers, but on write operations (regardless of the register length) its error ratio becomes 50% in a continuous operation cycle.
I haven't found any previous issues about using libmodbus in two concurrently running processes. I have been trying to figure out the issue or a solution for a week, but I couldn't make further progress. I can give more details if there are any questions, but the issue seems to be software related rather than hardware related. I should mention that when TCP/IP is used in either or both of the processes, no issue is observed.
libmodbus output with debug mode enabled
For the process started later on.
Reading device registers!
[01][03][00][00][00][64][44][21]
Waiting for a confirmation...
ERROR Connection timed out: select
Connection timed out
<01><03><01><06><02><00><49><52><2D><4F><33><32><31><30><30><30><30><30><00><00><00><00><00><80><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><00><7C>Devices' identification process has failed!