Robo-Tank v6.0 is Ready - Now v6.5
Oops, missed that. :) Thanks for letting me know things went well, slowly getting there.
Reply to top
I think someone else mentioned this, but the install for Robo-Tank could also just be in a different folder, with no need to mess with ports. I'm assuming the Robo-Tank software is still using reef-pi's API calls.

So, using IPs as an example:

Defaulting to reef-pi
192.168.1.xx
Robo-tank could be 192.168.1.xx/robo-tank

I would love the capability to run both.
Reply to top
(04-11-2022, 08:44 AM)Coloredrock Wrote: I think someone else mentioned this, but the install for Robo-Tank could also just be in a different folder, with no need to mess with ports. I'm assuming the Robo-Tank software is still using reef-pi's API calls.

So, using IPs as an example:

Defaulting to reef-pi
192.168.1.xx
Robo-tank could be 192.168.1.xx/robo-tank

I would love the capability to run both.

You can't just change the directory for Robo-Tank, as port 80 is used by reef-pi's internal web server.
You have to change the reef-pi port or change the port Apache uses for Robo-Tank.
Reply to top
Yeah, to run both you'll need to change the port number of one of them. My app isn't using the reef-pi API; it runs on its own, which is why they conflict.
Reply to top
Hi Rob! I had installed Debian Bullseye, so on your advice I did a fresh install with Debian Buster and installed v6.5 directly. No problem, it works well. I installed the pH and temperature probes: detected and OK. AC ports and DC ports OK. Clock 24h format OK. Good job!
You must have forgotten to disable auto logout. It doesn't matter.
I'll keep testing.
Reply to top
Hi Tutuss, that's great, thanks for letting me know. I thought about the auto logout right after I packaged the update so yeah I'll try and have it for the next one. :)
Reply to top
Heya Rob. I was hoping an update would fix this problem I'm having. I think I told you about it in an email a while back. I add a rule so that when pH drops below 8.2 it turns on a DC port, and if pH goes above 8.3 it turns that DC port back off. Once these are saved I look at the rules and they show as "if null is less then 8.2...". I have verified the rules are not actually doing anything either. Any suggestions?
Reply to top
I think I do remember this. I just added a rule for pH and get the same result; maybe I saw this before and forgot to look into it. I will get it fixed for the next release, which I've been working on. Thanks for letting me know.
Reply to top
Is anyone seeing duplicate entries in the log added in v6.5?
Reply to top
Not duplicated, but they were saved wrong because the type ID doesn't match the schedule type.
Reply to top
Thanks and the logTypeID isn't supposed to match the schedule type.
Reply to top
I don't remember exactly what I fixed, but previously it was logging misleading data (better not to log than to log misleading data).
Probably it wasn't logging the source properly, or something similar.
Reply to top
For the second time, a full bag of Easy SPS evo was dosed due to bugs in the Robo-Tank clock.

If you don't want to lose the tank, use reef-pi, as Robo-Tank is absolutely unreliable in handling the clock.

Look at the attached image: there is a huge difference between the current time and the time seen by Robo-Tank, because the "clock" (it's a counter, not a clock) hangs, and so does the dosing pump, which stays on forever.

[Attached image: screenshot]
Reply to top
I'm confused, a couple weeks ago in email you told me you changed how the RTC code worked because you didn't like how I set it up. Sorry it didn't work out for you.
Reply to top
The scheduler is stock. What I've changed is the initial clock read done during controller startup, which is now always taken from system time; everything else is stock. What is not working and has huge flaws is your scheduler and your Arduino-style way of counting milliseconds.

Until schedules are handled reliably, and until you find a way to stop the dosing when the controller is not responding, I would suggest not using Robo-Tank. The risk of nuking the whole tank is too high. There are live animals inside! The animals' health has to be guaranteed.

You know better than me that this issue has nothing to do with the startup time sync, as you don't use the time to stop the scheduler but a simple loop-style milliseconds counter. Also, the controller booted 7 days ago....
Reply to top
To be honest I'm a little surprised about the fear of running on a live tank as it feels like you've been making more changes in the code lately than I have based on our emails and posts. You're also updating manually due to this and if anything is missed it's not good.

The scheduling system is actually well written and quite fail-proof as long as the program runs. After running it for over a year I've never seen the program lock up or miss a schedule, but I guess anything is possible, which is why it's still beta. The scheduling uses milliseconds but is based on actual time. As you mentioned there is no real clock, everything is just a counter; surely you can't use an actual time format in code. The timestamp counts from January 1, 1970 12:00:00 AM (the Unix epoch), which is used as the starting point. When the program starts up, a schedule runs, or a schedule is added or edited, a simple function calculates every schedule's next upcoming time as a Unix timestamp. Then a simple loop triggers a schedule when the current timestamp is greater than that schedule's next timestamp; once that time passes the schedule has to trigger, it's just logic. Once it triggers, it updates the next upcoming schedules and waits again. It's a very simple system that uses hardly any code. If the program is running, it's impossible for a schedule not to run.
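
As a rough illustration of the loop described above, here is a minimal C++ sketch. It is not Robo-Tank's actual code: `Schedule`, `nextRunAfter` and `dueSchedules` are hypothetical names, and it assumes every schedule simply repeats at a fixed interval in seconds.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical schedule record (assumption: fixed repeat interval).
struct Schedule {
    int id;
    std::int64_t intervalSec;  // seconds between runs
    std::int64_t nextRun;      // next trigger time as a Unix timestamp
};

// Compute the first run time strictly after `now`, keeping the original phase.
std::int64_t nextRunAfter(const Schedule& s, std::int64_t now) {
    assert(s.intervalSec > 0);
    if (s.nextRun > now) return s.nextRun;
    std::int64_t missed = (now - s.nextRun) / s.intervalSec + 1;
    return s.nextRun + missed * s.intervalSec;
}

// One pass of the main loop: collect the IDs of schedules that are due at
// `now` and move each of them forward to its next upcoming time.
std::vector<int> dueSchedules(std::vector<Schedule>& all, std::int64_t now) {
    std::vector<int> fired;
    for (Schedule& s : all) {
        if (now >= s.nextRun) {  // due (or overdue): it has to trigger
            fired.push_back(s.id);
            s.nextRun = nextRunAfter(s, now);
        }
    }
    return fired;
}
```

The main loop would call `dueSchedules` with the current Unix time on every pass and start whatever each returned ID stands for. The sketch only covers triggering; stopping a pump is a separate concern.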

The dosing is a separate system from scheduling. The schedule simply starts the dosing process, at which point the dosing system is responsible for stopping the pump. If a pump doesn't stop it's not due to scheduling; there are no off schedules for dosing. I think you've mentioned a few times that you changed things with dosing, so that could be the issue. The run time for dosing is calculated as dose pump rate * dose amount. When the pump actually starts, the system flags it as running, and a loop constantly monitors this flag and compares the counter against the run time to stop the pump. This is also a pretty simple system; as long as the program runs, the logic has to flow.
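
The dosing flow described above could look roughly like the following C++ sketch. The names `DoseRun`, `doseRunTimeMs`, `startDose` and `pollDose` are made up for illustration, and the sketch assumes the pump rate is calibrated in milliseconds per millilitre, which the post does not state.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical dosing state for one pump.
struct DoseRun {
    bool running = false;
    std::int64_t stopAtMs = 0;  // counter value at which the pump must stop
};

// Run time = pump rate (assumed ms per mL) * dose amount (mL).
std::int64_t doseRunTimeMs(std::int64_t msPerMl, double doseMl) {
    assert(msPerMl > 0 && doseMl >= 0.0);
    return static_cast<std::int64_t>(static_cast<double>(msPerMl) * doseMl);
}

// Flag the pump as running and record when it has to stop.
void startDose(DoseRun& run, std::int64_t nowMs, std::int64_t msPerMl, double doseMl) {
    run.running = true;
    run.stopAtMs = nowMs + doseRunTimeMs(msPerMl, doseMl);
}

// Called on every pass of the main loop.
// Returns true when the pump should be switched off on this pass.
bool pollDose(DoseRun& run, std::int64_t nowMs) {
    if (run.running && nowMs >= run.stopAtMs) {
        run.running = false;
        return true;
    }
    return false;
}
```

Note that this only works while the loop keeps calling `pollDose`; if the program crashes or blocks after `startDose`, nothing in this sketch stops the pump.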

I'm confident with the scheduling but I still want to improve the RTC code but it's nothing related to what you're experiencing. If you feel it's RTC related disable that and it'll use NTP instead.

Signs that the program has stopped running would be a red plug next to the time, the health meter dropping to 0, and the clock in the browser no longer updating, as this shows what the controller is working on. One thing I've noticed with the Pi and the C++ program is that it never locks up; if something occurs that would cause a lock-up, it crashes instead and the program is closed. For my personal use I don't use the Pi service file to start the program. Instead I leave my SSH terminal open 24/7 with the program started manually (not really recommended, as the program stops if the SSH terminal is closed). I do this so I can see the program's output to verify things do what they should, and also to know if the program ever crashes. Honestly I've never seen it crash without reason, but it's still possible if something isn't configured correctly. As I find these cases I'm adding extra validation in the front end so they can't happen, but that's a process. Things happen in the real world that one can never predict.

The program would crash when a function runs that has bad code or invalid data in a variable. For example, if a dosing pump started up and a variable wasn't correct, it's very possible the program crashes after it starts the pump, and at that point it wouldn't stop the pump as the program is down and out. Maybe something like this is happening. I do have dosing pumps configured and this has never happened, but again I'm just wondering if a change you made might be causing it. Editing someone else's code is very hard when you don't know how it all flows. This is a large program and everything connects one way or another, so you really need to be in my head to see how it's all laid out.

I certainly don't hold it against you for not using it, I understand and thanks for your feedback.
Reply to top
(04-24-2022, 12:43 AM)Rob F Wrote: To be honest I'm a little surprised about the fear of running on a live tank as it feels like you've been making more changes in the code lately than I have based on our emails and posts. You're also updating manually due to this and if anything is missed it's not good.

Again, I'm NOT using the customized version. As I've already told you, I've reverted my changes (primarily due to the email library issue).
The only changes currently active are the I2C check skipped for the sensor with ID 104 (as you suggested), so that the RTC is not seen and everything acts as usual,
AND a couple of changes in the web admin. Obviously the web admin has nothing to do with the controller itself.




based on the system log the culprit is............. the retry feature, added by "someone very excited about that" in v6.5.


Code:
Apr 24 00:10:00 reef-tank startup[854]: RUNNING SCHEDULE #66
Apr 24 00:10:00 reef-tank startup[854]: startDosing------------------------------5
Apr 24 00:10:00 reef-tank startup[854]: TRY AGAIN                 -
Apr 24 00:10:00 reef-tank startup[854]: TRY AGAIN                 -
Apr 24 00:10:00 reef-tank startup[854]: TRY AGAIN                 -
Apr 24 00:10:00 reef-tank startup[854]: TRY AGAIN                 -
Apr 24 00:10:00 reef-tank startup[854]: TRY AGAIN                 -
Apr 24 00:10:00 reef-tank startup[854]: TRY AGAIN                 -
Apr 24 00:10:00 reef-tank startup[854]: TRY AGAIN                 -
Apr 24 00:10:01 reef-tank startup[854]: TRY AGAIN                 -
Apr 24 00:10:01 reef-tank startup[854]: TRY AGAIN                 -
Apr 24 00:10:01 reef-tank startup[854]: TRY AGAIN                 -
Apr 24 00:10:01 reef-tank startup[854]: TRY AGAIN                 -
Apr 24 00:10:01 reef-tank startup[854]: TRY AGAIN                 -
Apr 24 00:10:01 reef-tank startup[854]: TRY AGAIN                 -
Apr 24 00:10:01 reef-tank startup[854]: TRY AGAIN                 -
Apr 24 00:10:01 reef-tank startup[854]: TRY AGAIN                 -
Apr 24 00:10:02 reef-tank startup[854]: TRY AGAIN                 -
Apr 24 00:10:02 reef-tank startup[854]: TRY AGAIN                 -
Apr 24 00:10:02 reef-tank startup[854]: TRY AGAIN                 -
Apr 24 00:10:02 reef-tank startup[854]: TRY AGAIN                 -
Apr 24 00:10:02 reef-tank startup[854]: TRY AGAIN                 -
Apr 24 00:10:02 reef-tank startup[854]: TRY AGAIN                 -

Apr 24 00:56:31 reef-tank startup[854]: TRY AGAIN                 -
Apr 24 00:56:32 reef-tank startup[854]: TRY AGAIN                 -
Apr 24 00:56:32 reef-tank startup[854]: TRY AGAIN                 -
Apr 24 00:56:32 reef-tank startup[854]: TRY AGAIN                 -

and so on until I unplugged the cable (2 hours later, when I was back home and the Easy SPS bag was already empty).

Why does this happen? Because there isn't any safety measure. It's an unbounded while loop: if the PCA doesn't get the signal, your retry will retry forever.
And when this happens, even the clock hangs (as per my screenshot), and if the clock hangs, everything else won't work as expected.

A better system would be to retry 2-3 times then force OFF the affected pin, send email, send sms, warnings, call 911, the fire fighters and whatever you want, but SHUTDOWN THE PIN.

Forcing the pump/DC/AC/whatever to OFF or to a safe level is what I've been trying to tell you for weeks, but you keep adding features (that's good) without increasing the reliability of the existing code.

Suggestion 2: when you start the counter for dosing/AC/DC/whatever, save the expected end time somewhere in the database and clear the field when finished.
Then, with an external script run from crontab, totally separate from the Robo-Tank application, check the expiry time and, if it is still set, force the shutdown of the port. This will shut down the port even if Robo-Tank has crashed.
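
A minimal C++ sketch of the decision such an external watchdog would make is shown below. The `PortDeadline` record, the `expiredPorts` function and the grace period are assumptions made for illustration; reading the rows from the database and actually switching a port off are left out because the thread does not describe those interfaces.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical row read from the database by the watchdog.
// expectedEnd is a Unix timestamp; 0 means the field was cleared.
struct PortDeadline {
    int portId;
    std::int64_t expectedEnd;
};

// Return the ports whose deadline is still set and already in the past.
// graceSec gives the main program a little time to clear the field itself.
std::vector<int> expiredPorts(const std::vector<PortDeadline>& rows,
                              std::int64_t now, std::int64_t graceSec) {
    assert(graceSec >= 0);
    std::vector<int> expired;
    for (const PortDeadline& r : rows) {
        if (r.expectedEnd != 0 && now > r.expectedEnd + graceSec) {
            expired.push_back(r.portId);
        }
    }
    return expired;
}
```

A small program started from cron every minute could load the rows, call `expiredPorts`, and force each returned port off. It can only help if the path to the hardware still works when the main program is down.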

I started making some fixes in the previous weeks, but every time you make a new release my fixes are ignored, so the question is: why waste time if you discard every change?
I could also start fixing this huge flaw, but I know you'll ignore everything again when releasing v6.6, like you did for v6.5 and v6.4.

For example, where are my patches for adding sub-milliliter dosing, like 15.50 ml? Or the patches for allowing PWM on each dosing pump? A lower rotation rate increases dosing precision. Or my patches for increasing the dosing time, and so on? No comments so far, no suggestions on my code or on a better implementation in your software. Just silently dropped.

You said the dosing system has been running properly for years, but the fix in v6.4 for the variable overflow that led to an unlimited dose is mine....

(No, I'm NOT using any of my patches to the C++ code, just the ones for the web interface, because using a patched version that needs to be totally re-patched on every release is a pain.)

Anyway, here's a fix for the PCA.
Maybe always writing 0 (and not the previous value) is better? Done this way, the controller doesn't get stuck in TRY AGAIN, freezing everything else.

Code:
void pca9685::adjustPinLevel(short pinID, short value) {  // adjust pca9685 pin level
    std::cout << "adjustPinLevel  - " << pinID << " - " << value << std::endl;
    short pinLoc = 0;
    for(short a=0; a < pca9685TotalPins; a++ )  // search all channels for matching pinID
    {
        if (pca9685PinID[a] == pinID) {pinLoc = a; break;}  // record array location
    }

    short driverLoc = 0;
    for(short a=0; a < pca9685totalEnabled; a++ )  // search all channels for matching i2c address so we know which driverID is required
    {
        if (pca9685Address[a] == pca9685AddressSaved[pinLoc]) {driverLoc = a; break;}  // record array location
    }

    short a = 0;
    char buffer[5];
    int length = 5;
    int cnt = 0; // Counter for max retry
    int originalValue = getPinLevel(pinID);

    buffer[0] = PCA9685_CHANNEL_0 + (pca9685PinNumber[pinLoc] << 2);
    buffer[1] = a & 0xFF;
    buffer[2] = (a >> 8) & 0x1F;
    buffer[3] = value & 0xFF;
    buffer[4] = (value >> 8) & 0x1F;
    write(pca9685I2Cfile[driverLoc], buffer, length); // initiate write
    delay(5);

    // verify GPIO pin was updated
    bool f = 0;  // flag to exit while loop
    while (f == 0 && cnt < 3)  // check if pca9685 received the signal
    {
        if (getPinLevel(pinID) == value) {f = 1;}  // pca9685 was updated correctly
        else {  // pca9685 wasn't updated, try again
            std::cout << "TRY AGAIN - new value: " << value << ", original value: " << originalValue << std::endl;
            logError(3, to_string(pca9685AddressSaved[pinLoc]), "PCA9685 did not update");
            delay(50);
            f = 0;
            buffer[0] = PCA9685_CHANNEL_0 + (pca9685PinNumber[pinLoc] << 2);
            buffer[1] = a & 0xFF;
            buffer[2] = (a >> 8) & 0x1F;
            buffer[3] = value & 0xFF;
            buffer[4] = (value >> 8) & 0x1F;
            write(pca9685I2Cfile[driverLoc], buffer, length); // initiate write
            delay(50);

            cnt++;
        }
    }

    // If f is still 0 we got here after 3 unsuccessful retries, so force the original value back
    if (!f) {
        std::cout << "TRY AGAIN EXPIRED FORCE PREVIOUS VALUE. original value: " << originalValue << std::endl;
        logError(3, to_string(pca9685AddressSaved[pinLoc]), "PCA9685 did not update");
        delay(50);
        buffer[0] = PCA9685_CHANNEL_0 + (pca9685PinNumber[pinLoc] << 2);
        buffer[1] = a & 0xFF;
        buffer[2] = (a >> 8) & 0x1F;
        buffer[3] = originalValue & 0xFF;         // low byte must also come from originalValue, not value
        buffer[4] = (originalValue >> 8) & 0x1F;  // high byte of originalValue
        write(pca9685I2Cfile[driverLoc], buffer, length); // initiate write
        delay(50);
    }
}
Reply to top
I am sorry this happened, definitely not what I want to hear, but now I think I understand. As for your changes, I haven't ignored them; I appreciate all feedback. I thought I mentioned I wasn't going to add the dosing pump changes as I was going to redo it all, which I'm currently in the process of. I'm adding your suggestions, such as schedules dictating how much is dosed. I'm also always improving existing code and adding fail-safes; this is why some updates don't look impressive, but changes and improvements are always being made. This is why I added the pca9685 verification and logging: to improve reliability. I felt that was more important than dosing features at the time. Unfortunately I'm also limited in the time I can spend on this. That's why I haven't started a thread on future features yet, as there are still too many other things that need addressing and it would feel like I'm ignoring them.

(04-24-2022, 03:17 AM)gandalf Wrote: You said the dosing system has been running properly for years, but the fix in v6.4 for the variable overflow that led to an unlimited dose is mine....

As for that variable size not being large enough, that was only an issue if the calculated number was too large. In all my testing I never thought to use such a large dose, or whatever caused the large number, so I never came across the issue. It's impossible for me to configure test setups for every possible scenario. That's why I need beta testers; as you're a software developer you should understand that it takes time and small details can sometimes be overlooked. Thank you for discovering the issue. Now we need to find the next one, because there will be one. This is why it's important to test everything after adding a schedule, custom rule or whatever, to verify it functions as expected.
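
To illustrate the kind of oversight being discussed, here is a hedged C++ sketch. The thread does not say which variable or type was involved, so the 16-bit limit, the millisecond units, and the names `fitsInInt16` and `safeRunTimeMs` are assumptions for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Returns true if a run time in milliseconds fits in a signed 16-bit variable.
bool fitsInInt16(std::int64_t runTimeMs) {
    return runTimeMs >= std::numeric_limits<std::int16_t>::min() &&
           runTimeMs <= std::numeric_limits<std::int16_t>::max();
}

// Compute rate * amount in 64-bit and refuse anything outside [0, maxMs],
// so an oversized dose is rejected instead of silently wrapping around.
bool safeRunTimeMs(std::int64_t msPerMl, std::int64_t doseMl,
                   std::int64_t maxMs, std::int64_t& outMs) {
    assert(maxMs > 0);
    if (msPerMl <= 0 || doseMl < 0) return false;
    if (doseMl != 0 && msPerMl > maxMs / doseMl) return false;  // product would exceed maxMs
    outMs = msPerMl * doseMl;
    return true;
}
```

With an assumed rate of 1200 ms per mL, a 30 mL dose already needs 36000 ms, which no longer fits in a 16-bit signed variable (maximum 32767). A wider type plus an explicit upper bound avoids that class of bug.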

Putting aside that this caused you harm, the pca9685 verification did exactly what I hoped and sniffed out an issue. You're correct that, as is in 6.5, it's possible for it to run forever, which would have the same effect as the program freezing. Shortly after I released it I shook my head that I didn't add an escape method; this is something I'm always aware of and make sure can't happen. I guess I rushed it out without realizing, but it's there now. It's the same idea as yours: just count the retries and exit if there are too many.

Your fix won't do anything though. The reason it went into an infinite loop is that it lost all communication with the pca9685, so aborting after 3 tries and then essentially trying a 4th time to set the pin off will do nothing, as it can't communicate; you may as well just make the loop 4 tries and abort. One of the beauties of the verification is that we know every time this happens. I can foresee it happening rarely, but only to the point where the 2nd or maybe 3rd command sent is valid. If it loses all communication or this happens often, something needs addressing with the hardware. If the I2C bus hangs, which it sounds like it did, the only fix is a reset, and that's easy now and something I'll add. After it tries, say, 10 times it will create a log entry and an alert saying the controller was restarted due to a pca9685 timeout, and then reboot the controller. When the Pi restarts, the pca9685 will be reinitialized and all outputs set to off. This might be the safest option and should always get the pca9685 reset within 20-30 seconds (Pi boot time) of the failure, before it sorts itself out. As it'll be clearly logged, plus the alert, one will know when it happens and the problem can be looked into and fixed. So the validation is invaluable; without it one is guessing, and now we know. So I double down on being very excited about this feature.

If this had been in place a week ago with a reboot, I think what happened would have been avoided, minus the extra dosing due to the long start-up time, and we still would have known it happened. The start-up time is something I hope to improve later. Currently the service file doesn't start the program until a network connection is verified, as it's needed for NTP time. Now that an RTC is available I don't see why it needs to wait, so I'm hoping to have the program up and running in under 5 seconds at some point.
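
The bounded retry with escalation described above could be sketched in C++ like this. It is not the actual v6.5 or later code: `writeVerified`, `WriteResult`, and the `writeFn`/`readFn` callbacks are hypothetical stand-ins for the real I2C write and read-back, and the delays between attempts are left out.

```cpp
#include <cassert>
#include <functional>

enum class WriteResult { Ok, GaveUp };

// Write a value, read it back, and try at most maxTries times in total.
// writeFn and readFn are hypothetical stand-ins for the I2C write and read-back.
WriteResult writeVerified(const std::function<void(int)>& writeFn,
                          const std::function<int()>& readFn,
                          int value, int maxTries, int& triesUsed) {
    assert(maxTries > 0);
    for (triesUsed = 1; triesUsed <= maxTries; ++triesUsed) {
        writeFn(value);
        if (readFn() == value) return WriteResult::Ok;  // device confirmed the value
    }
    triesUsed = maxTries;
    return WriteResult::GaveUp;  // caller should log, alert, and reset the bus or reboot
}
```

The key property is that the function always returns, so the main loop and the clock keep running. What the caller does on `GaveUp` (log entry, alert, bus reset or reboot) is a separate decision.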

So yeah, now I think the problems you've had with dosing were never related to code, minus the variable oversight; it's always done what it should. As you've never had issues with doses starting or other things, that means the scheduling system works (logging will help verify), and now that the pca9685 verification has told us the I2C bus locked, we know it's hardware. No software is able to shut off a pin if it can't communicate.

Looking at your order I see you have an external pca9685, do you only have that connected or other I2C devices as well? How long are the cables going to the external device/s? My gut thinks it may be that causing instability on the I2C bus, it doesn’t take much. This is why for your kH tester I didn’t like to hear I2C, it can cause a bad day and ultimately I do need to lose it for expansions like the pca9685.
Reply to top
I don't have anything connected. Just the plain V2 controller.
I bought one expander in the first order but it's still in the bag.

A controller reboot would be OK; a bit overkill, but better than nothing.

Maybe an I2C hard reset from the Raspberry Pi would be enough?

(Since this night's freeze, the pH controller has disappeared. Obviously I've already rebooted the Pi.)

I'm not sure that my fix won't work, because the I2C bus has worked with no issues at all since day 1. I've never had a single issue with the bus; then, immediately after updating to v6.5 (3-4 days ago; I dose Easy SPS every 3 days), the first dosing (with the new retry) froze everything. A bit strange for a coincidence. Also, the write succeeded (the dosing pump started properly) and the read, done on the next line of code after a 5 ms delay, failed? Maybe the delay is too small, or something similar?
Reply to top
It's possible the 5ms delay is too quick but I haven't experienced any issues. I agree a coincidence seems odd if this happened immediately after updating. Are you able to reproduce the issue?

I forgot to mention, those 2 extensions you just ordered will plug into the DB9s. However, the DC ports won't be speed controllable like the DC ports on the controller, as these will be controlled via the Pi's GPIOs, so you can only turn the ports on/off. I know you like to adjust speed, so I wanted to make sure you know these can't do it.
Reply to top

