Incorrect work of examine() in case of halt groups #902

Open
kr-sc opened this issue Aug 7, 2023 · 13 comments

Comments

kr-sc (Contributor) commented Aug 7, 2023

I launch the spike simulator with 4 harts:

spike \
  --rbb-port=7777 \
  --isa=rv64gc \
  --dm-progsize=6 \
  --dm-sba=0 \
  --dm-abstract-rti=0 \
  -p4 \
  -m0x80000000:0xa00000 \
  <some_elf_file>

and then run the following script:

for LOG in {1..2}; \
  do openocd \
    -c "adapter driver remote_bitbang" \
    -c "remote_bitbang host 127.0.0.1" \
    -c "remote_bitbang port 7777" \
    -c 'gdb_port disabled' \
    -f spike_smp4_rtoshw4.cfg \
    -c "log_output ${LOG}.log" \
    -c init \
    -c targets \
    -c shutdown -d3; \
done;
spike_smp4_rtoshw4.cfg:
proc init_targets {} {
	set _CHIPNAME riscv
	jtag newtap $_CHIPNAME cpu -irlen 5 -expected-id 0x10e31913

	set _TARGETNAME_0 $_CHIPNAME.cpu0
	set _TARGETNAME_1 $_CHIPNAME.cpu1
	set _TARGETNAME_2 $_CHIPNAME.cpu2
	set _TARGETNAME_3 $_CHIPNAME.cpu3

	target create $_TARGETNAME_0 riscv -chain-position $_CHIPNAME.cpu -coreid 0 -rtos hwthread
	target create $_TARGETNAME_1 riscv -chain-position $_CHIPNAME.cpu -coreid 1 -rtos hwthread
	target create $_TARGETNAME_2 riscv -chain-position $_CHIPNAME.cpu -coreid 2 -rtos hwthread
	target create $_TARGETNAME_3 riscv -chain-position $_CHIPNAME.cpu -coreid 3 -rtos hwthread

	target smp $_TARGETNAME_0 $_TARGETNAME_1 $_TARGETNAME_2 $_TARGETNAME_3
}

Result of the first iteration:

TargetName         Type       Endian TapName            State
--  ------------------ ---------- ------ ------------------ ------------
 0  riscv.cpu0         riscv      little riscv.cpu          running
 1  riscv.cpu1         riscv      little riscv.cpu          running
 2  riscv.cpu2         riscv      little riscv.cpu          running
 3* riscv.cpu3         riscv      little riscv.cpu          running

Result of the second iteration:

TargetName         Type       Endian TapName            State
--  ------------------ ---------- ------ ------------------ ------------
 0  riscv.cpu0         riscv      little riscv.cpu          running
 1  riscv.cpu1         riscv      little riscv.cpu          unknown
 2  riscv.cpu2         riscv      little riscv.cpu          unknown
 3* riscv.cpu3         riscv      little riscv.cpu          unknown

Also, if you add additional printing in riscv-013.c, in the function riscv013_get_hart_state(), you can see that the dmstatus values differ between the two runs:

From 8ca3f30a292977f7b94ba057a86fab7459b8f2c8 Mon Sep 17 00:00:00 2001
From: Kirill Radkin <[email protected]>
Date: Mon, 7 Aug 2023 12:57:39 +0300
Subject: [PATCH] addtional printing

---
 src/target/riscv/riscv-013.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/target/riscv/riscv-013.c b/src/target/riscv/riscv-013.c
index 24713a93c..70f905c35 100644
--- a/src/target/riscv/riscv-013.c
+++ b/src/target/riscv/riscv-013.c
@@ -2496,6 +2496,7 @@ static int riscv013_get_hart_state(struct target *target, enum riscv_hart_state
 		return ERROR_FAIL;
 
 	uint32_t dmstatus;
+	LOG_INFO("DMSTATUS == %" PRIx32, dmstatus);
 	if (dmstatus_read(target, &dmstatus, true) != ERROR_OK)
 		return ERROR_FAIL;
 	if (get_field(dmstatus, DM_DMSTATUS_ANYHAVERESET)) {
-- 
2.34.1

If I understand it correctly, this situation is connected to how examine() is implemented in riscv-013.c. At the end of examine(), OpenOCD adds all targets to a halt group (because we have an SMP configuration in this case). This leads to an interesting situation: during the first connection the harts are halted and resumed one by one (as they should be), but by the second connection the harts have already been merged into a halt group. Because of that, when OpenOCD tries to halt the first target, all harts end up halted.
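
For context, halt-group membership is programmed through the dmcs2 debug module register (0x32 in the RISC-V Debug Specification). Below is a minimal sketch of the grouping step, assuming the register/field names generated in OpenOCD's debug_defines.h and using dmi_write() as a stand-in for the actual DMI write helper; it is not the actual implementation:

/* Sketch: put the hart(s) currently selected in dmcontrol into halt
 * group <group>.  hgselect = 0 addresses halt groups (as opposed to
 * external trigger groups); group 0 means "no group". */
static int set_halt_group(struct target *target, unsigned int group)
{
	uint32_t dmcs2 = set_field(0, DM_DMCS2_HGSELECT, 0);
	dmcs2 = set_field(dmcs2, DM_DMCS2_HGWRITE, 1);
	dmcs2 = set_field(dmcs2, DM_DMCS2_GROUP, group);
	return dmi_write(target, DM_DMCS2, dmcs2);
}

Because this write changes state inside the debug module itself, the grouping survives an OpenOCD disconnect, which is why the second connection sees different behaviour.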

kr-sc (Contributor, Author) commented Aug 25, 2023

@timsifive, can you confirm that my reasoning is correct?

JanMatCodasip (Collaborator) commented Aug 29, 2023

@kr-sc I haven't yet tried to reproduce it locally. What values of dmstatus are you getting, please? Are they unexpected?

Also, your added LOG_INFO print in the patch is placed before the dmstatus_read() call - can you check whether that is a typo?

My expectation is that the examination should pass even if the harts are members of a halt/resume group. For the examination, it is important that the given hart being examined can be halted for a brief period of time (and then resumed, if running previously).

If in that process some other hart(s) get halted/resumed also (due to the halt/resume groups), that should not break the examination as such.

kr-sc (Contributor, Author) commented Aug 30, 2023

> I haven't yet tried to reproduce it locally. What values of dmstatus are you getting, please? Are they unexpected?

I get values that differ only in the all/any resumeack bits; maybe it's not important here.

> Also, your added LOG_INFO print in the patch is placed before the dmstatus_read() call - can you check whether that is a typo?

Oh, it should be after dmstatus_read(), of course.
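
For clarity, the corrected placement (the same one-line addition, moved below the read):

	uint32_t dmstatus;
	if (dmstatus_read(target, &dmstatus, true) != ERROR_OK)
		return ERROR_FAIL;
	LOG_INFO("DMSTATUS == %" PRIx32, dmstatus);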

> My expectation is that the examination should pass even if the harts are members of a halt/resume group. For the examination, it is important that the given hart being examined can be halted for a brief period of time (and then resumed, if running previously).

The main problem is that when we halt only the first target during the second examine() call (if it was not already halted, for example), we halt all the targets that were added to the halt group during the previous examine() call, and we lose the information about each hart's state before the second examine.

timsifive (Collaborator) commented:

Can you add -d to your OpenOCD invocations, capture the output, and attach it here?

kr-sc (Contributor, Author) commented Sep 5, 2023

1.log
2.log

After the second examine, targets 1-3 are in the halted state (not unknown; maybe somebody fixed that in newer versions), but before the first connection all of them were running.

kr-sc (Contributor, Author) commented Sep 5, 2023

P.S. The unknown hart state was fixed here: 198edca

timsifive (Collaborator) commented:

The difference seems to be simply that in the first log the harts are running while in the second log they are halted.

@aap-sc do you have time to look at this? It looks related to (or maybe even fixed by) #900.

timsifive (Collaborator) commented:

Wait, is @kr-sc's latest comment saying this was fixed by #900? If so, can we just close this?

aap-sc (Collaborator) commented Sep 8, 2023

@timsifive not quite. What @kr-sc says is that the unknown status was fixed. However, the underlying issue is still there.

It's just that instead of the unknown status, the harts will end up halted. I'll try to produce an updated version of the logs. My understanding is that this is related to how halt groups are handled by OpenOCD during examine: combining harts into a halt group is an invasive change which is not reverted upon disconnect/re-examine. It's been a while since I last looked at this one, so my memory may fail me.

Actually, @kr-sc was going to look at this one eventually; however, since halt groups are not supported by Syntacore HW, this task is not of the highest priority. This issue was filed just to document the problem.

That being said, I'll double-check the situation.

kr-sc (Contributor, Author) commented Sep 18, 2023

> The difference seems to be simply that in the first log the harts are running while in the second log they are halted.

Yes, but both runs connect to the same simulator session, whose targets were in the running state before the first connection. After the second one, they end up halted. I think that is incorrect behaviour of OpenOCD, because we didn't send any commands during these two connections, yet we changed the harts' state.

timsifive (Collaborator) commented:

To summarize:

  1. When first connecting, all harts are running. After examine() all harts are still running. examine() did put all harts in the same halt group.
  2. When connecting a second time, all harts are also running. However, during examine(), halting the first hart halts all the other harts as well (because they're in the same halt group). After examine(), hart 0 is running (because it was running when examine() first looked at it), but the remaining harts are halted (because they were halted when examine() first looked at them).

I agree that's a bug. Not sure when I'll get to it. One solution might be to clear all halt groups before examining the first hart. (But examine() is called per "target" without any extra context, so it's not as straightforward as it might seem.)
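
A hedged sketch of that idea follows (helper and field names are illustrative, following debug_defines.h; real code would also have to handle hartselhi for systems with more than 1024 harts):

/* Sketch only: write group 0 ("no group") to dmcs2 for every hart, so a
 * halt group left over from a previous session cannot distort the
 * per-hart halt/resume probing that examine() performs. */
static int clear_all_halt_groups(struct target *target, unsigned int hart_count)
{
	for (unsigned int hartid = 0; hartid < hart_count; hartid++) {
		/* Select the hart in dmcontrol. */
		uint32_t dmcontrol = set_field(0, DM_DMCONTROL_DMACTIVE, 1);
		dmcontrol = set_field(dmcontrol, DM_DMCONTROL_HARTSELLO, hartid);
		if (dmi_write(target, DM_DMCONTROL, dmcontrol) != ERROR_OK)
			return ERROR_FAIL;
		/* Drop the selected hart out of its halt group. */
		uint32_t dmcs2 = set_field(0, DM_DMCS2_HGWRITE, 1);
		dmcs2 = set_field(dmcs2, DM_DMCS2_GROUP, 0);
		if (dmi_write(target, DM_DMCS2, dmcs2) != ERROR_OK)
			return ERROR_FAIL;
	}
	return ERROR_OK;
}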

kr-sc (Contributor, Author) commented Sep 26, 2023

> One solution might be to clear all halt groups before examining the first hart.

Halt groups are used to implement the OpenOCD SMP target, so I think we should clear the halt groups before OpenOCD disconnects from the target.
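
A sketch of where that cleanup could hook in, assuming a clear_all_halt_groups() helper like the one sketched above (riscv013_deinit_target() exists in riscv-013.c, but the call shown here is an illustration, not current behaviour):

static void riscv013_deinit_target(struct target *target)
{
	/* Existing teardown elided.  Before the debug session goes away,
	 * revert the invasive grouping that examine() performed.  The hart
	 * count would come from the examined configuration; 4 matches the
	 * spike example above. */
	if (target_was_examined(target))
		clear_all_halt_groups(target, 4);
}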

aap-sc (Collaborator) commented Sep 26, 2023

> so I think we should clear the halt groups before OpenOCD disconnects from the target.

I think we should reset halt groups even upon the initial connection. It makes examine even more invasive :(, but the resulting behavior is more predictable.

kr-sc changed the title from "Incorrect work of examine()" to "Incorrect work of examine() in case of halt groups" on Jan 30, 2024