Question: Array Assignment
-
This was after a run which had multiple losses of WiFi connectivity, which has been happening a lot lately, but afterward it still showed as Homed, so it must not have rebooted.
5/15/2024, 12:08:43 PM M122
=== Diagnostics ===
RepRapFirmware for Duet 2 WiFi/Ethernet version 3.5.1 (2024-04-19 14:40:46) running on Duet WiFi 1.02 or later + DueX5
Board ID: 08DGM-9T6BU-FG3SN-6JKD0-3S06Q-9AY7D
Used output buffers: 12 of 26 (26 max)
=== RTOS ===
Static ram: 23256
Dynamic ram: 80072 of which 224 recycled
Never used RAM 952, free system stack 116 words
Tasks: NETWORK(1,ready,15.1%,215) HEAT(3,nWait 5,0.1%,328) Move(4,nWait 5,0.0%,298) DUEX(5,nWait 5,0.0%,19) MAIN(1,running,84.0%,703) IDLE(0,ready,0.8%,29), total 100.0%
Owned mutexes:
=== Platform ===
Last reset 00:53:40 ago, cause: software
Last software reset at 2024-05-15 11:15, reason: HardFault bfarValid precise, Gcodes spinning, available RAM 824, slot 0
Software reset code 0x4063 HFSR 0x40000000 CFSR 0x00008200 ICSR 0x00417803 BFAR 0x6d754e40 SP 0x20002568 Task MAIN Freestk 1151 ok
Stack: 6d754e44 6d754e40 200027a3 00000017 ffffffff 0045d6a7 004663aa 210e0000 00410f21 20002784 00000017 20002684 20002784 00000002 00000000 00000001 00411491 00000000 20002680 20002688 00000001 00000000 00000001 a5a5a5a5 00000000 454c000c 6d754e44
Error status: 0x0c
Aux0 errors 0,0,0
MCU temperature: min 43.0, current 46.3, max 46.4
Supply voltage: min 24.0, current 24.2, max 24.4, under voltage events: 0, over voltage events: 0, power good: yes
Heap OK, handles allocated/used 297/68, heap memory allocated/used/recyclable 6144/2808/840, gc cycles 264
Events: 0 queued, 0 completed
Driver 0: standstill, SG min 0
Driver 1: standstill, SG min 0
Driver 2: standstill, SG min n/a
Driver 3: standstill, SG min n/a
Driver 4: standstill, SG min n/a
Driver 5: standstill, SG min 0
Driver 6: standstill, SG min 0
Driver 7: standstill, SG min 0
Driver 8: standstill, SG min n/a
Driver 9: standstill, SG min n/a
Driver 10:
Driver 11:
Date/time: 2024-05-15 12:08:42
Cache data hit count 4294967295
Slowest loop: 20.69ms; fastest: 0.10ms
I2C nak errors 0, send timeouts 0, receive timeouts 0, finishTimeouts 0, resets 0
=== Storage ===
Free file entries: 9
SD card 0 detected, interface speed: 20.0MBytes/sec
SD card longest read time 15.4ms, write time 0.0ms, max retries 0
=== Move ===
DMs created 83, segments created 3, maxWait 2785753ms, bed compensation in use: none, height map offset 0.000, max steps late 0, min interval 0, bad calcs 0, ebfmin 0.00, ebfmax 0.00
no step interrupt scheduled
Moves shaped first try 0, on retry 0, too short 0, wrong shape 0, maybepossible 0
=== DDARing 0 ===
Scheduled moves 109, completed 109, hiccups 0, stepErrors 0, LaErrors 0, Underruns [0, 0, 4], CDDA state -1
=== Heat ===
Bed heaters 0 -1 -1 -1, chamber heaters -1 -1 -1 -1, ordering errs 0
Heater 1 is on, I-accum = 0.0
=== GCodes ===
Movement locks held by null
HTTP is idle in state(s) 0
Telnet is idle in state(s) 0
File is idle in state(s) 0
USB is idle in state(s) 0
Aux is idle in state(s) 0
Trigger is idle in state(s) 0
Queue is idle in state(s) 0
LCD is idle in state(s) 0
Daemon is doing "G4 P{global.DaemonPeriod}" in state(s) 0 0, running macro
Autopause is idle in state(s) 0
Q0 segments left 0
Code queue 0 is empty
=== Filament sensors ===
check 0 clear 25655011
Extruder 0 sensor: no data received
Extruder 1 sensor: no data received
=== DueX ===
Read count 0, 0.00 reads/min
=== Network ===
Slowest loop: 21.15ms; fastest: 0.07ms
Responder states: HTTP(0) HTTP(0) HTTP(0) FTP(0) Telnet(0)
HTTP sessions: 1 of 8
=== WiFi ===
Interface state: active
Module is connected to access point
Failed messages: pending 0, notrdy 0, noresp 0
Firmware version 2.1.0
MAC address 84:f3:eb:83:47:be
Module reset reason: Power up, Vcc 3.36, flash size 4194304, free heap 42792
WiFi IP address 192.168.1.130
Signal strength -41dBm, channel 11, mode 802.11n, reconnections 0
Clock register 00002002
Socket states: 0 0 0 0 0 0 0 0
-
@dc42 More information:
(EDIT: I SPOKE TOO SOON. Still getting array corruption, or at least the "expected numeric operand" even without nan param)
(EDIT: Confirmed that array corruption is still happening.)
I've been doing some troubleshooting and discovered the macro is sometimes receiving "nan" for the S and/or N parameter(s). This shouldn't happen and was due to an uninitialized global value used to calculate those parameters by the calling macro. So far, it seems like correcting that error prevented nan in the parameters and eliminated the array corruption.
It's possible RRF should have reported an error somewhere, and it seems it should have done so before corrupting array contents, though I haven't analyzed how nan would affect this macro. I'd also expect the behavior to be consistent: either an error every time or, if nothing was syntactically or semantically illegal, never, rather than only occasionally. Again, there may be a logical reason for this. Maybe using nan in certain contexts should simply generate an error saying you can't do that, if it's currently allowed.
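As a stopgap, the macro could reject bad parameters itself before touching the arrays. This is only a sketch: it assumes the S and N parameters seen in the log line "AddNeoPix C {0,0,0} S 23 N 5" arrive as param.S and param.N, and that isnan() accepts whatever numeric type the caller passes. The messages are made up for illustration.

```gcode
; Hypothetical guard for the top of AddNeoPix (not part of the original macro)
; Refuse to run if S or N is missing
if !exists(param.S) || !exists(param.N)
    abort "AddNeoPix: missing S or N parameter"

; Refuse to run if the calling macro computed a NaN for S or N
if isnan(param.S) || isnan(param.N)
    abort "AddNeoPix: S or N parameter is NaN"
```

As far as I understand, abort stops the calling macros too, which makes the bad call easy to spot but may be too drastic during a print. Replacing it with an echo followed by M99 would return from just this macro instead.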
I'm also still interested in whether my assumptions about array initialization and assignment are correct, and how the nuts and bolts work.
-
@DonStauffer how should I initialise global.NeoPixColor?
-
var SENT = 0
var PEND = 1
global NeoPixLEDCount = 28
global NeoPixColor = vector(2, vector(global.NeoPixLEDCount, vector(3, 0)))
set global.NeoPixColor[var.PEND] = vector(global.NeoPixLEDCount, null)
NeoPixColor[0] holds the color already transmitted to each LED, to avoid resending data unnecessarily. NeoPixColor[1] collects updated colors for each LED, to send them all at once for efficiency. The elements are initialized to null, meaning "there is nothing to send".
I2C is a slow way to update NeoPixels. This is an attempt to eliminate some communication overhead. So AddNeoPix sums the colors sent to it for each LED, then there's a SendNeoPix macro which sends the LED colors from the "pending" part of the array, if they exist (are not null) and are not the same as the "sent" part of the array (NeoPixColor[0]). The colors get added to the pending part of the array rather than replacing existing values there, to aid in color blending, for displaying overlapping ranges of LEDs in different colors, with the middle a blend of the 2 colors.
The problem probably happens in less than 1% of calls to AddNeoPix. Running bed leveling probably calls it a couple hundred times, and about half of bed leveling runs produce the error. Here is the most recent (cleaned up) code and log. It appears as though the assignment of one element of the RGB array corrupts another element (it happens randomly; not always element 0 corrupting 1, or even an earlier element corrupting a later one). Note that "line 56" in the error refers to what is line 21 in the snippet.
; Add to pending color values, to be sent later
var LEDNum = null
var Pend = null
while iterations < var.Count
    set var.LEDNum = var.Start + iterations
    set var.Pend = global.NeoPixColor[var.PEND][var.LEDNum]

    ; Check for null pending RGB array
    if var.Pend == null
        set var.Pend = {0, 0, 0}

    ; Add new color value to pending
    echo "LED#",var.LEDNum,": Pend1=",var.Pend, "Color=",var.Color
    while iterations < #var.Color
        echo "Pend[i]=",var.Pend[iterations], "Color[i]=",var.Color[iterations]
        set var.Pend[iterations] = var.Pend[iterations] + var.Color[iterations]
    echo "Pend2=",var.Pend
    set global.NeoPixColor[var.PEND][var.LEDNum] = var.Pend
Log
LED 22 : Pend1= {0,0,0} Color= {0,32,16}
Pend[i]= 0 Color[i]= 0
Pend[i]= 0 Color[i]= 32
Pend[i]= 0 Color[i]= 16
Pend2= {0,32,16}
Exiting AddNeoPix
AddNeoPix C {0,0,0} S 23 N 5
LED# 23 : Pend1= {0,0,0} Color= {0,0,0}
Pend[i]= 0 Color[i]= 0
Pend[i]= null Color[i]= 0
Error: in file macro line 56 column 80: meta command: expected numeric operands
-
@dc42 Workaround, tested a dozen runs with no errors; avoids subscripted assignment of RGB array elements by using 3 discrete variables to build an array to assign in its entirety:
; Add to pending color values, to be sent later
var LEDNum = null
var Pend = null
var r = null
var g = null
var b = null
while iterations < var.Count
    set var.LEDNum = var.Start + iterations
    set var.Pend = global.NeoPixColor[var.PEND][var.LEDNum]

    ; Check for null pending RGB array
    if var.Pend == null
        set var.Pend = {0, 0, 0}

    ; Add new color value to pending
    set var.r = var.Pend[0] + var.Color[0]
    set var.g = var.Pend[1] + var.Color[1]
    set var.b = var.Pend[2] + var.Color[2]
    set var.Pend = {{var.r},{var.g},{var.b}}
    set global.NeoPixColor[var.PEND][var.LEDNum] = var.Pend
-
@DonStauffer I believe I have identified an issue when using 'set' to assign an element in an array of arrays, where the nested array whose element is being assigned is a copy of another array. I have created this issue https://github.com/Duet3D/RepRapFirmware/issues/1008.
Using 'set' with only a single array index should always be safe. For example, I think the following may be unsafe if global.myArray[1] is a copy of another array:
set global.myArray[1][2] = 1
whereas the following would be safe:
var tempArray = global.myArray[1]
set var.tempArray[2] = 1
set global.myArray[1] = var.tempArray
-
@dc42 Off the top of my head, that sounds like exactly the workaround I tried which didn't work. What ended up working as a workaround was avoiding subscripted assignment to the subarray, but instead constructing the array from discrete variables like
set global.myArray[1] = {var.A, var.B, var.C}
I'll look back through my notes and see if I'm remembering right. EDIT: Ah, I sort of "half" did it. I was using an array with an array element with an array element, so the original workaround attempt did start with two subscripts. Maybe that was why it didn't work. Anyhow, my workaround reconstructing the subarray "manually" without subscripts did work, so I'm not stuck any more.
Intermittent problems are tough to troubleshoot. BTW, error messages often don't name the macro they're referring to even when they give line numbers. If it's a complicated tree of nested M98 calls, it only names the outermost one and I have to go searching through the branches. The name of the macro the offending line is actually in would really help.
-
@DonStauffer said in Question: Array Assignment:
Intermittent problems are tough to troubleshoot. BTW, error messages often don't name the macro they're referring to even when they give line numbers. If it's a complicated tree of nested M98 calls, it only names the outermost one and I have to go searching through the branches. The name of the macro the offending line is actually in would really help.
It’s been requested before, but can be quite a memory intensive process. See https://github.com/Duet3D/RepRapFirmware/issues/771
Ian
-
@droftarts Option or command to dump call stack maybe?
-
@DonStauffer said in Question: Array Assignment:
@droftarts Option or command to dump call stack maybe?
You can always create that yourself. Have a global array that you push and pop the name of the macro to, and print it out when you need to. Or just echoing the name of the macro at the start may be enough.
-
@gloomyandy Kind of what I'm doing now. I put something like echo "MyMacroName" at the beginning of each macro, and echo "Leaving MyMacroName" as the last line. That clutters the log, but I'm thinking maybe something like:
In config.g:
global CallStack = vector(<nice large number>, "")
global CallStackDepth = 0
Then, presumably at the outermost level of whatever I'm running:
set global.CallStackDepth = 0
Then in my macro:
set global.CallStack[global.CallStackDepth] = "MyMacroName"
set global.CallStackDepth = global.CallStackDepth + 1
. . .
set global.CallStackDepth = global.CallStackDepth - 1
When I needed it, I could run a DumpCallStack macro:
while iterations < global.CallStackDepth
    echo global.CallStack[iterations]
Maintaining the global.CallStackDepth = 0 could be bothersome when running a macro separately from a stack of macros, but at least it's only one simple line. But a bigger problem would be that an error may abort only the macro it's in, and the calling macro then continues. So CallStackDepth will be left too high in that case. Besides that functional problem, the stack would get more added to it after the error, leaving me unable to tell where the error was - the whole point of the stack trace. If there were a way to know about the error and do something when it occurs, this could be solved, but I don't know of a way. I'm not sure when execution continues vs. the entire call stack aborts (in which case this scheme works). I just assumed it was whether execution could continue, based on the severity of the error.
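One possible way to keep CallStackDepth from drifting when a called macro exits early is for each macro to remember the depth it started at and re-assert it after every M98. This is a rough sketch only: var.myDepth and ChildMacro.g are made-up names, and it assumes the caller keeps running after the child fails. It keeps the depth consistent but does not by itself record where the error happened.

```gcode
; At the top of each macro: remember our own level, then push our name
var myDepth = global.CallStackDepth
set global.CallStack[var.myDepth] = "MyMacroName"
set global.CallStackDepth = var.myDepth + 1

; Call a child macro (hypothetical file name)
M98 P"ChildMacro.g"
; Re-assert our level in case the child exited without decrementing
set global.CallStackDepth = var.myDepth + 1

; Last line of the macro: pop back to the level we started at
set global.CallStackDepth = var.myDepth
```

Every assignment here uses a single subscript, so it stays clear of the multi-subscript issue discussed earlier in this thread.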
I wonder how much processing overhead this would take as compared with a compiled-in RRF solution.
-
@DonStauffer The thing is it takes zero amount of memory and adds zero overhead for the many folks who do not need it. That would not be the case for a firmware based solution. Plus you get to use it now rather than waiting for anyone else to implement it and you can tailor it to your needs.
I suspect just echoing the name of the function at the start of a macro is enough for most situations.
-
@gloomyandy That's what I'm doing now, but it's got some problems. Iterative calls add sometimes thousands of lines to the console log, and I have to keep adding and removing these lines of code as I go back and edit macros. I've gotten so I just comment them out and leave them there for the future. But the log clutter is problematic.
-
@gloomyandy I suspect the amount of overhead for a compiled-in version would be very minimal. I don't know what's under the hood in RRF, but if there's a correlation between macros and subroutines, the compiler may already have debugging info stored anyway, and it would just be a matter of having a command to dump that to the log. But I'm speculating.
-
@DonStauffer None of the compiler-generated debug information is in the final binary installed on the control board; there is nowhere near enough flash space for it. The pressure on RAM and (especially) flash space is already very high on the Duet 2 boards, so adding anything new requires a lot of thought and consideration.
-
@gloomyandy Makes sense. I imagine it's a real programming challenge, and we users appreciate it! Sometimes I ask about features and I say "I wonder if" because I don't know how it's all structured under the hood. Who knows, a valuable feature that seems difficult might end up being trivial (or vice versa).
Thanks!
-
@dc42 Question:
What does & doesn't "count" as "a copy of another array"? Is it pretty much any way of setting the array's values, even initialization, where the right side isn't a scalar? Or is it just a copy from a named array variable which can trigger this problem? For instance, could any of these lines trigger the problem later?
var A1= vector(2, vector(2, 0)) ; Presumably, array vector creates is copied to A1?
set var.A1[0] = {1, 2} ; Presumably, array constant gets copied to A1[0]?
set var.A1[0][0] = vector(2, 0) ; Presumably, array vector creates is copied to A1[0][0]?
If I'm understanding correctly, if I want an array with more dimensions than 4 or 5 and therefore can't subscript elements directly, there's no possible way to assign element values without making a copy. (see code)
; So, given:
var Inner = vector(2, vector(2, vector(2, vector(2, 0))))
var Outer= vector(2, vector(2, vector(2, vector(2, var.Inner))))
...
; Instead of
set var.Outer[0][0][0][0][0][0][0][0] = 5 ; 8 subscripts
; or
var A3 = var.Outer[0][0][0][0]
set var.A3[0][0][0][0] = 5 ; Dangerous, set using multi-subscript on copy
set var.Outer[0][0][0][0] = var.A3 ; (OK because Outer isn't a copy - or is it?)

; I'd have to do something like:
var Inner = vector(2, vector(2, vector(2, vector(2, 0))))
var Outer= vector(2, vector(2, vector(2, vector(2, var.Inner))))
; Presumably, Outer[n][n][n][n] elements all contain copies of Inner
...
var A3 = var.Outer[0][0][0][0] ; OK, assigning to A3, not to element of a copy
var A4 = var.A3[0] ; Now A4 contains a copy of var.Outer[0][0][0][0][0] (would be 5 subscripts)
var A5 = var.A4[0] ; Now A5 contains a copy of var.Outer[0][0][0][0][0][0] (would be 6 subscripts)
var A6 = var.A5[0] ; Now A6 contains a copy of var.Outer[0][0][0][0][0][0][0] (would be 7 subscripts)
set var.A6[0] = 5 ; OK, only one subscript
set var.A5[0] = var.A6 ; OK, only one subscript
set var.A4[0] = var.A5 ; OK, only one subscript
set var.A3[0] = var.A4 ; OK, only one subscript
set var.Outer[0][0][0][0] = var.A3 ; OK? Or is Outer a copy of vector construct?
; ^ Important question above ^
-
@DonStauffer The problem is triggered by any assignment that includes more than one subscript, it's that simple. So in your first example above line 3 may cause a problem, the others are probably ok.
I'm not really sure where you are going with all of this stuff to try and get more dimensions than the maximum. But you need to be aware that you cannot write to a copy of the array and use that to modify the original. So, for instance, the moment you do set var.A6[0] = 5 it will break the relationship between A6 and Outer: it makes a fresh copy of the values that you can access via A6[<n>] and sets the entry A6[0] to 5; it will not set Outer[0][0][0][0][0][0][0] to 5. If you understand the concept of "copy on write", basically all of the parts of an array are like that. In effect this means that although you can probably construct an array with more than the maximum number of dimensions, it will be effectively read-only, and even then you will need to jump through hoops to read it.
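A small two-level illustration of that behaviour, as a sketch rather than anything authoritative. The variable names are made up for the example, and the comments describe what the explanation above implies rather than output captured from a board:

```gcode
; Illustrative only: a 2 x 2 array and a variable taken from one of its rows
var Outer = vector(2, vector(2, 0))
var Copy = var.Outer[0]          ; Copy initially refers to the same row as Outer[0]

set var.Copy[0] = 5              ; single subscript: Copy gets its own fresh row here
echo "Copy=", var.Copy           ; shows the modified row
echo "Outer[0]=", var.Outer[0]   ; per the description above, still the original values

set var.Outer[0] = var.Copy      ; an explicit write-back is needed to change Outer
echo "Outer[0]=", var.Outer[0]   ; now reflects the change
```

Each set uses only one subscript, which is the form described earlier in the thread as safe.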
You should probably try the build that DC42 has made available to you in this thread: https://forum.duet3d.com/topic/35769/error-meta-command-too-many-indices/14
It contains the fix that we have come up with for this problem.
-
@gloomyandy Oh and as to that final
set var.Outer[0][0][0][0] = var.A3
It may be OK, but it may not. If any of the existing component arrays that make up var.Outer have another variable that happens to reference them, then you may be in trouble. So, for instance, if earlier in the program you had var aa = var.Outer[0][0] or something similar, then that could trigger the problem.
Out of interest why do you need so many indices?
-
@DonStauffer I didn't realize he had the build up already. That's fast work!
Where I'm going is simple: I don't want the global namespace cluttered, so I put all the global data I need for a project into a single global variable name. Were I to consider changing my project for this purpose right now, I'd probably wait, but I had already gone to the trouble of making all the changes when I discovered these unexpected challenges.
Sure, delayed writing is a performance trick. But I didn't expect to be able to take advantage of any link to the original. I'm not sure why it would be read only. That's why my code steps back through each subscript and assigns to the previous copy, until it gets back to the original array. It seems like that should work. Not efficient, but it's supposed to be a workaround.
Given a deep array, first, assign an element of the first dimension to a variable. Whether it's linked under the hood at this point doesn't matter. Now, assign an element of the first dimension of THAT variable to a second variable. Continue, until you are at the penultimate dimension. That variable gets the intended value assigned. Then, assign that to the same subscript of the prior variable that it came from. Then assign THAT variable to the same subscript of the prior variable that it came from. Etc., until you assign the first variable to the same subscript of the original array it came from. No link is needed to the original. Not an easy way to assign, but it seems like it makes it possible. I'm copying a branch node by node then reassembling the branch after assigning to the disassembled leaf node. Without the subscript bug, I'd only need to do this at level 4 or 5 without percolating all the way through the depth of the array. Much simpler.