Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UART Performance issues with 2.0.2 version #6689

Closed
handmade0octopus opened this issue May 4, 2022 · 45 comments
Closed

UART Performance issues with 2.0.2 version #6689

handmade0octopus opened this issue May 4, 2022 · 45 comments
Assignees
Labels
Area: Performance Issue related to performance problems and improvements Status: Test needed Issue needs testing
Milestone

Comments

@handmade0octopus
Copy link

Board

Not board specific

Device Description

Not board specific

Hardware Configuration

Not board specific

Version

v2.0.2

IDE Name

PlatformIO 4.2.0

Operating System

Windows 10

Flash frequency

80MHz

PSRAM enabled

no

Upload speed

921600

Description

I have been using Platformio 3.5.0 without any issues for very long time.

I tried to moving on to version 4.2.0 based off 2.0.2 but it seems it has lots of underlying problems or it requires me to rewrite most of my code.

I can't share all of it as its quite advanced, instead I will provide you with few insights and code that had dramatic performance issues. I run all code on mostly tasks with priority 4-8 and less important ones being on 3 and 2.

Top is version 3.5.0 of Platformio (I don't know exactly which Arduino version it's based off - sorry).
Bottom is version 4.2.0
Arduino Platformio 3.5.0 (top) to 4.2.0 (bottom) comparision

Note - I left device to run for 5 minutes to make sure test is fair.

Displays are as follows (starting from left top corner):

  • Log speed - is .CSV logging appending the file with ~30 parameters, speed is in rows/second - as you can see on 3.5.0 it was stable and flat and it's jittery on new version (but it does keep up.

  • Max heap - I wouldn't worry about this one, but we can see change in max heap and it's much smaller than on previous version.

  • RPS - it's "responses per second" - basically I use single wire serial connection (LIN type) to gather data from old typ ECU, here my performance is most important and as you can see it doesn't keep up with it. Maybe my code here is wrong as it's very complicated to provide great stability, but I have been using this one for few years and never has issues before.

It's worth mentioning I am using GPS at 5Hz data rate constantly sending data to my second Serial input.

  • Free heap - here difference is huge and I can't believe it's only due to change from 3.5.0 to 4.2.0 - but I can't find any other culprit!

But the worst offender of all is the part of sketch below that I use to transfer data from/to SD card. Below is snippet of function I use to get those performance stats:

Before I managed to get constantly around 71 ms:

[E][Gauge.S.cpp:1536] handleCopyFile(): Time to read /displayConfig.json -     73 ms

But now it's not even funny:

[E][Gauge.S.cpp:1536] handleCopyFile(): Time to read /displayConfig.json -   1466 ms

I think SD card handling may have changed so some sort of guide of how to move to new type of handling files would be appreciated.

Sketch

#define SD_CS 15
#define SD_SCK 14
#define SD_MISO 27
#define SD_MOSI 13
SPIClass SDSPI(HSPI);

// In setup()
void setup() {
	SDSPI.begin(SD_SCK, SD_MISO, SD_MOSI, SD_CS);
	SD.begin(SD_CS, SDSPI, 80000000);
}

// Yes I could have written it better
// It basically transfers files from SD card to flash and vice versa with usage of temp.dat file in case of power failure.

#define COPY_BUFFER 512
#define TEMP_NAME "/temp.dat"
boolean handleCopyFile(const char* nameSource, const char* nameTarget, boolean overwrite = true, boolean toSDCard = false, boolean removeSource = false) {
	#ifdef DEBUG
		uint32_t fileWriteStamp = millis();
	#endif
	uint8_t bufFile[COPY_BUFFER];
	boolean sourceExist;
	boolean targetExist;
	if(toSDCard) {
		sourceExist = SPIFFS.exists(nameSource);
		targetExist = SD_USE.exists(nameTarget);
	} else {
		sourceExist = SD_USE.exists(nameSource);
		targetExist = SPIFFS.exists(nameTarget);
	}
	if(!overwrite) overwrite = !targetExist;
	if(sdReady && sourceExist && overwrite) {
		File sourceFile;
		if(toSDCard) sourceFile = SPIFFS.open(nameSource, "r");
		else sourceFile = SD_USE.open(nameSource, "r");
		if(!sourceFile) return false;
		boolean renamed = false;
		if(targetExist) {
			if(toSDCard) {
				// SD_USE.remove(nameTarget);
			} else {
				renamed = SPIFFS.rename(nameTarget, TEMP_NAME);
			}
		}
		
		File targetFile;
		if(toSDCard) targetFile = SD_USE.open(nameTarget, "w");
		else targetFile = SPIFFS.open(nameTarget, "w");
		if(!targetFile) {
			sourceFile.close();
			if(renamed) SPIFFS.rename(TEMP_NAME, nameTarget);
			return false;
		}
		
		size_t bytesRead = COPY_BUFFER;
		while(bytesRead = sourceFile.read(bufFile, COPY_BUFFER)) {
			targetFile.write(bufFile, bytesRead);
			#ifdef DEBUG
		//	Serial.write(bufFile, bytesRead);
			#endif
		}
		targetFile.flush();
		sourceFile.close();
		targetFile.close();

		if(renamed) SPIFFS.remove(TEMP_NAME);

		#ifdef DEBUG
		log_e("Time to read %s - %6d ms", nameSource, millis() - fileWriteStamp);
		#endif

		if(removeSource) {
			if(toSDCard) {
				SPIFFS.remove(nameSource);
			} else {
				SD_USE.remove(nameSource);
			}
		}
		delay(20);
		return true;
	} else if(targetExist) return true;
	return false;
}

Debug Message

Before on 3.5.0:

[E][Gauge.S.cpp:1536] handleCopyFile(): Time to read /displayConfig.json -     73 ms

Now on 4.2.0:

[E][Gauge.S.cpp:1536] handleCopyFile(): Time to read /displayConfig.json -   1466 ms


### Other Steps to Reproduce

_No response_

### I have checked existing issues, online documentation and the Troubleshooting Guide

- [X] I confirm I have checked existing issues, online documentation and Troubleshooting guide.
@handmade0octopus handmade0octopus added the Status: Awaiting triage Issue is waiting for triage label May 4, 2022
@VojtechBartoska
Copy link
Collaborator

VojtechBartoska commented May 4, 2022

Hi, there were issues with SD card in 2.0.2.

Please retest this with 2.0.3 which was just released. SD performance issues are fixed there.

Thanks

In general, there are not any performance issue in other parts you mentioned above (as far as we know) and 2.0.2 is out for quite some time.

@VojtechBartoska VojtechBartoska added Resolution: Awaiting response Waiting for response of author and removed Status: Awaiting triage Issue is waiting for triage labels May 4, 2022
@handmade0octopus
Copy link
Author

In general, there are not any performance issue in other parts you mentioned above (as far as we know) and 2.0.2 is out for quite some time.
Will do once they release on PlatformIO.

You haven't changed how Serial is handled for 2.0? Any other things I can look for for performance bottlenecks?

@SuGlider
Copy link
Collaborator

SuGlider commented May 4, 2022

Main difference from Arduino Core 1.0.6 (used in PIO 3.x) to Core 2.0.2 (used in PIO 4.x) is that most Peripherals were refactored to use IDF instead of reading/writing registers directly.

Serial is based on IDF now, but I don't think it is a problem for performance.
Everything is IRQ driven and there may be a overhead of maybe 500 CPU cycles on it, if compared to 1.0.6.

@mrengineer7777
Copy link
Collaborator

mrengineer7777 commented May 4, 2022

PIO is still pointing to arduino-esp32 v1.0.6 (very old). Jumping to 2.0.2 is a huge leap, and yes there were issues with SD card that were solved in 2.0.3. Serial did change but no known issues. Try 2.0.3.

@SuGlider
Copy link
Collaborator

SuGlider commented May 4, 2022

@handmade0octopus
I guess you use Serial to read a GPS, is it right?
We provide a new API HardwareSerial::onReceive(callbackFunction) that may help your application to get better performance.
It will callback a fuction as soon as data arrives to UART, like an ISR...

@handmade0octopus
Copy link
Author

Thanks @SuGlider, any idea about increased heap usage? Do you know what sort of things changed in this regards? Does Arduino 2.0 uses any extra xTaskCreate for its internal use that I may stumble upon?

@SuGlider
Copy link
Collaborator

SuGlider commented May 4, 2022

Regarding Heap, we have changed a few configurations for WiFi in sdkconfig file.
There was an issue with Heap left with WiFi when Arduino Core 2.0.0 was compared to 1.0.6.
Not sure if this is your case... Also not sure if PIO has those changes as well...

Look at #5474 (comment) to see a comparation:
PR #5791

Arduino v1.0.6 (v3.3.5-1-g85c43024c) ==> Free memory: 213,900 bytes || Binary Size 692,190 bytes
Arduino v2.0.0 (v4.4-dev-2313-gc69f0ec32 / sha.44c1198) ==> Free memory: 142,228 bytes || Binary Size 734,576 bytes
Arduino v2.0.1 (v4.4-dev-3401-gb86fe0c66c / sha.f459b10) ==>Free memory: 216,140 bytes || Binary Size 645,381 bytes

Comparing V2.0.0 with V2.0.1, this sketch has about 74K extra heap and takes about 90K less flash with the compiled binary.
The numbers are now likely better than it was before with 1.0.6.

@SuGlider
Copy link
Collaborator

SuGlider commented May 4, 2022

Regarding UART onReceive() check this #6134 and #6364

@handmade0octopus
Copy link
Author

@SuGlider I really try to port my code but its getting harder every second I look.

SD.remove() API has been changed and it requires new things now, I cant use my wifi portal (this may be due to SD card issues).

The heap problem was with the Wifi off (not initiated at all) with wifi on usage is similair fortunately but it doesn't work.

Is there a full list of functions that has been affected so I can go through and fit them? Even some things doesn't display right (one of my lcd.println(string) is affected - but only single one..).

Sorry for my frustration but it seems I am stuck with very old version as trying to find all bugs that update caused it's not feasible.

I understand you had to make some changes to make it easy to port to new IDF versions but it's so strange all the performance issues are worse (can't use 2.0.3 for now so I can't check).

Also I found the issue I had with large flash size - it was BluetoothSerial.h - once I commented it out flash size was exactly the same.

@VojtechBartoska
Copy link
Collaborator

VojtechBartoska commented May 4, 2022

@handmade0octopus Just go ahead with 2.0.3. And yes, all the changes are listed in particular releases.

Edit: What is the reason you cannot used 2.0.3? Or at least 2.0.2 with manually check out SD fixes?

@handmade0octopus
Copy link
Author

@VojtechBartoska PlatformioIO doesn't support it yet.

Please don't make out of me chooser beggar, I understand how it sounds. I am more than thankful for all hard work you guys do. I am just trying to use it and it saddens me seeing I have to rewrite code I worked for over 2 years on.

SD card isn't the only problem I have, there are many, many more I listed above. I don't know if changing approach on Serial will cause my program to work better, I think those simple read() commands should work as fast as any other - especially that I guard them with Serial.available() everywhere - this just doesn't make any sense to me sorry.

@SuGlider
Copy link
Collaborator

SuGlider commented May 4, 2022

I understand you had to make some changes to make it easy to port to new IDF versions but it's so strange all the performance issues are worse (can't use 2.0.3 for now so I can't check).

I'm not expert on PIO, but it seems that there may be a way to setup 2.0.3 in PIO.
It's necessary to change some PIO setup files (I have not tried it...).

Check these links:

#6647 (comment)
#6632 (comment)

@mrengineer7777
Copy link
Collaborator

@handmade0octopus Welcome to changing major rev numbers (1.X -> 2.X). Before you get too far, I recommend starting a completely separate project. That way you can start with a working base and slowly bring over functionality. Oh and use a VCS system to checkpoint your project. Github Desktop has been handy for me.

As a developer I've had to do that several times with various projects. Expect 1 to 2 weeks of work porting it. Or you can stick with 1.0.6. Your choice.

@Jason2866
Copy link
Collaborator

@handmade0octopus You can use the release 2.0.3 with Platformio.
Use this setup

platform = espressif32
platform_package = framework-arduinoespressif32 @ https:/espressif/arduino-esp32.git#142fceb8563cd1795d619829e0a103770a344e1a

@handmade0octopus
Copy link
Author

handmade0octopus commented May 4, 2022

You can use the release 2.0.3 with Platformio.

This doesn't work unfortunately. Still loads 2.0.2.

Expect 1 to 2 weeks of work porting it. Or you can stick with 1.0.6. Your choice.

Of course I am prepared to do that, but first I need to understand why my current Serial.available() in a loop and normal writing and reading is not performing as it should. Weather it is something to do with how it is structured or some internal timings.

I am just afraid that I will spend those 2 weeks porting it and issue is still persistent.

@SuGlider
Copy link
Collaborator

SuGlider commented May 5, 2022

@handmade0octopus

Can you give us more information about what you said that available/read/write are not performing as expected?

@gonzabrusco
Copy link
Contributor

@handmade0octopus You can use the release 2.0.3 with Platformio. Use this setup

platform = espressif32
platform_package = framework-arduinoespressif32 @ https:/espressif/arduino-esp32.git#142fceb8563cd1795d619829e0a103770a344e1a

You are missing an 's' in platform_packages

The correct way would be:

platform = espressif32
platform_packages = framework-arduinoespressif32 @ https:/espressif/arduino-esp32.git#142fceb8563cd1795d619829e0a103770a344e1a

@VojtechBartoska VojtechBartoska added the Area: Performance Issue related to performance problems and improvements label May 5, 2022
@SuGlider SuGlider self-assigned this May 5, 2022
@handmade0octopus
Copy link
Author

@gonzabrusco thanks it worked.

But found issue:

error: call of overloaded 'String(float, uint8_t)' is ambiguous

Had to change:

explicit String(float, unsigned int decimalPlaces = 2);
explicit String(double, unsigned int decimalPlaces = 2);

To:

explicit String(float, unsigned char decimalPlaces = 2);
explicit String(double, unsigned char decimalPlaces = 2);

In WString.h and .cpp or cast second parameter as (uint32_t)..

Potential wrapper using old method just calling new ones? Something to consider!

I like speed of SD card now, it's faster than before, WiFi connects faster.

Still VERY HIGH usage of heap and serial is performing badly.

I disabled GPS and it's not issue with it, it has to do something with how I handle LIN connection.

I would also reconsider how SD "remove" is handled as "SD.remove(file.name())" doesn't work anymore but leaves people without clue, I would add "change .name() to .path()" in debug messages as people can go around in circles by trying to solve it.

Regarding serial I believe it has something to do with .available() method (or how it blocks thread), Ill run some tests and let you know.

@Jason2866
Copy link
Collaborator

Jason2866 commented May 5, 2022

In project Tasmota we migrated since 2.0 alpha was out. Migrating the code from 1.0.x core to 2.0.x gave us better performance and MORE free heap and less flash usage. Migrating was sometimes a pain, using a alpha version... Imho you have librarys or/and in your code stuff which needs to be adopted to work (nice) with 2.0.3.
There is NO smooth change from 1.0.x to 2.0.x. It is to expect by a Major release chnage.

@handmade0octopus
Copy link
Author

Ok I investigated UART thing a little.

I will try to explain it to you in minimal simplified code - basically I run a task inside a loop:

// serial == Serial2.begin(9600, SERIAL_8E2);
xTaskCreatePinnedToCore(
		serialTask,
		"serialTask",
		8192,
		(void*) 0,
		8, // I set different priority - no change
		NULL,
		0); // I set different cores - no change

uint8_t hexAddress[255];
uint8_t response[255];

void serialTask(void* param) {
	while(1){
		ds2.sendCommand(hexAddress);
		uint32_t readResponse = ds2.receiveData(response);
		if(readResponse) {
			// do stuff
		}
		delay(1);
	}
}

uint8_t DS2::sendCommand(uint8_t command[], uint8_t respLen) {
	if(messageSend) {
		return 0;
	} else {
		if(respLen != 0) responseLength = respLen;
		timeStamp = millis();
		messageSend = true;
		respLen= serial.write(data, respLen);
		serial.flush(); // Commenting flush off actually improves performance!
		return respLen;
	}
}

uint8_t DS2::receiveData(uint8_t data[]) {
	log_e("Time: %d", millis() - timeStamp); // Here I log what I got below <===
	if(messageSend) {
		if(readData(data)) {
		        messageSend = false;
			return true;
		}
	}
	return false;
}

boolean DS2::readData(uint8_t data[]) {
	uint32_t startTime = millis();
	uint8_t availible = serial.available();
	if(availible > 2) {
		// do stuff
		return true;
	}
	return false;
}

I made this simple test setup.

Basically on version 1.0.5 I test I have very consistent:

[E][DS2.cpp:55] receiveData(): Time: 7
[E][DS2.cpp:55] receiveData(): Time: 7
[E][DS2.cpp:55] receiveData(): Time: 7
[E][DS2.cpp:55] receiveData(): Time: 7
[E][DS2.cpp:55] receiveData(): Time: 7
[E][DS2.cpp:55] receiveData(): Time: 7
[E][DS2.cpp:55] receiveData(): Time: 7
[E][DS2.cpp:55] receiveData(): Time: 7
[E][DS2.cpp:55] receiveData(): Time: 7

And on 2.0.3 I have:

[877344][E][DS2.cpp:55] receiveData(): Time: 7
[877345][E][DS2.cpp:55] receiveData(): Time: 8
[877346][E][DS2.cpp:55] receiveData(): Time: 9
[877347][E][DS2.cpp:55] receiveData(): Time: 10
[877350][E][DS2.cpp:55] receiveData(): Time: 13
[877355][E][DS2.cpp:55] receiveData(): Time: 18

[877464][E][DS2.cpp:55] receiveData(): Time: 7
[877465][E][DS2.cpp:55] receiveData(): Time: 8
[877466][E][DS2.cpp:55] receiveData(): Time: 9
[877467][E][DS2.cpp:55] receiveData(): Time: 10
[877470][E][DS2.cpp:55] receiveData(): Time: 13
[877475][E][DS2.cpp:55] receiveData(): Time: 18

And if I remove flush() after write it gets even funnier!

On 1.0.5 consistent:

[E][DS2.cpp:55] receiveData(): Time: 1
[E][DS2.cpp:55] receiveData(): Time: 2
[E][DS2.cpp:55] receiveData(): Time: 3
[E][DS2.cpp:55] receiveData(): Time: 4

[E][DS2.cpp:55] receiveData(): Time: 1
[E][DS2.cpp:55] receiveData(): Time: 2
[E][DS2.cpp:55] receiveData(): Time: 3
[E][DS2.cpp:55] receiveData(): Time: 4

[E][DS2.cpp:55] receiveData(): Time: 1
[E][DS2.cpp:55] receiveData(): Time: 2
[E][DS2.cpp:55] receiveData(): Time: 3
[E][DS2.cpp:55] receiveData(): Time: 4

On 2.0.3 though:

[ 11468][E][DS2.cpp:55] receiveData(): Time: 1
[ 11469][E][DS2.cpp:55] receiveData(): Time: 2
[ 11470][E][DS2.cpp:55] receiveData(): Time: 3
[ 11471][E][DS2.cpp:55] receiveData(): Time: 4
[ 11474][E][DS2.cpp:55] receiveData(): Time: 7
[ 11478][E][DS2.cpp:55] receiveData(): Time: 11
[ 11483][E][DS2.cpp:55] receiveData(): Time: 16

[ 11589][E][DS2.cpp:55] receiveData(): Time: 1
[ 11590][E][DS2.cpp:55] receiveData(): Time: 2
[ 11591][E][DS2.cpp:55] receiveData(): Time: 3
[ 11592][E][DS2.cpp:55] receiveData(): Time: 4
[ 11595][E][DS2.cpp:55] receiveData(): Time: 7
[ 11599][E][DS2.cpp:55] receiveData(): Time: 11
[ 11604][E][DS2.cpp:55] receiveData(): Time: 16

Hope that helps you to try to tell me to what test next..

To be honest I am lost completely. Looks like new version sends (or receives) data much slower and can't cope with changing serial that fast.

@SuGlider
Copy link
Collaborator

SuGlider commented May 5, 2022

serial.flush(); // Commenting flush off actually improves performance!

I think you already know it, but flush() waits (blocking) until the UART TX FIFO is fully empty.

About the performance, I'll test it soon.
I can already tell you that 2.0.0+ version has a HardwareSerial API that uses extra buffers and queues within IDF UART Driver.

Arduino Corte Versions 1.0.6 or lower goes directly to the SoC Registers and ISRs. Thus, it may have a faster response.

But we have a special API for faster reading/response UART: HardwareSerial::onReceive(callbackFunc)

It will require a change on the logic of your application, but I think it will get the same performance as before.

@handmade0octopus
Copy link
Author

Can you roughly explain how is onReceive triggered?

Is it triggered when available is bigger than 1? Does it happen on separate thread or its blocking?

I will try to write code but it would be useful how should I handle it for my application. Will test and let you know.

@SuGlider
Copy link
Collaborator

SuGlider commented May 6, 2022

Can you roughly explain how is onReceive triggered?

When data arrives to UART, an ISR adds it to the internal RX buffer.
If there are no UART incoming data for more than a timeout equivalent, by default, to 10 UART symbols equivalent in time (depends on the baudrate) it will trigger onReceive.

The timeout, in Symbols, can be set by HardwareSerial::setRxTimeout()

Alternatively, if UART receives a stream of data (no pause, no timeout), it will also trigger it for every 120 bytes.

Is it triggered when available is bigger than 1?

No. It is not triggered based on available(). But you can read available() in the callback function.
All Serial interfaces can be used inside the callback function.

Does it happen on separate thread or its blocking?

non-blocking - asynchronous to Arduino main task execution.
It's a callback running from a separated task, used only to treat UART events, among them, the data received in the conditions described above.

List of PRs with information and examples:
#6134 adds onReceive()
#6364 adds std::function support to onReceive()
#6397 About setRxTimeout()

I will try to write code but it would be useful how should I handle it for my application. Will test and let you know.

OK. Thanks.

@handmade0octopus
Copy link
Author

@SuGlider rewriting code soon. Last question, can I use delays and/or semaphores in onReceive()?

@SuGlider
Copy link
Collaborator

SuGlider commented May 9, 2022

@handmade0octopus - When coding using onReceive() just have in mind that it runs on separated Highest priority Task, thus, if it doesn't returns or goes into Blocked, and Suspended Task state, Arduino main sketch Task won't run also...

delay() makes the running Task to go to Blocked state.
The same for a Semaphore waiting FreeRTOS function.

Thus those two will not be a problem.
But an infitite no ending loop will be a problem, because it will run for ever and Arduino Setup/Loop task won't run at all (lower priority).

@handmade0octopus
Copy link
Author

@SuGlider sorry not had chance to play with it yet.

But an infitite no ending loop will be a problem, because it will run for ever and Arduino Setup/Loop task won't run at all (lower priority).

Well, it won't work as it will definitely trigger WDT.

By the way I understand this onReceive() triggered task is using core 1 - is there a way to change it if I need to or do I need to modify source code?

@SuGlider
Copy link
Collaborator

By the way I understand this onReceive() triggered task is using core 1 - is there a way to change it if I need to or do I need to modify source code?

Check this PR #6685
It will allow the sketch to change the Running Core, Task Priority and Task Stack Size for the onReceive() event task.
This PR shall be available for Arduino Core v2.0.4

@VojtechBartoska VojtechBartoska added Status: Test needed Issue needs testing and removed Resolution: Awaiting response Waiting for response of author labels May 30, 2022
@VojtechBartoska VojtechBartoska added this to the 2.0.4 milestone May 30, 2022
@VojtechBartoska
Copy link
Collaborator

Adding this to 2.0.4 Milestone, when it's released this issue can be retested and probably closed.

@handmade0octopus
Copy link
Author

handmade0octopus commented May 31, 2022

@SuGlider @VojtechBartoska good news. Finally been able to test it (been on holiday).

Long story short - onReceive() gave 0 performance increase.

But I found that using Serial2.setRxTimeout(1); brought back old performance back.

So you can still use polling with this setup very well.

My advice would be to set RX timeout to 1 by default so users can port their code without need of debugging what is going wrong?

Heap usage is still big, 50kb is lost somewhere.

Also, is there any #define where I can check for arduino version?

@lbernstone
Copy link
Contributor

ESP_ARDUINO_VERSION

schreibfaul1 added a commit to schreibfaul1/ESP32-MiniWebRadio that referenced this issue Jun 1, 2022
performance issuse with new Arduino version
if Webpage is open, image transfer is slow, not Wifi but SD or SD_MMC
espressif/arduino-esp32#6689
@SuGlider
Copy link
Collaborator

SuGlider commented Jun 1, 2022

Heap usage is still big, 50kb is lost somewhere.

Interesting... this issue with heap has been fixed a while ago.
Is it possible for you to compile the project using the Arduino IDE with Arduino Core 2.0.3?
Or at least a piece of your project that could be compared to the same piece built with PIO?

I think that PIO may not be using the same pre-compiled libraries and compiling options that Arduino IDE uses.

@Jason2866
Copy link
Collaborator

PIO does use the same pre-compiled libraries. If in doubt the release 2.0.3 from here can be used with the following setup.

platform = espressif32 @ 4.4.0
platform_packages = framework-arduinoespressif32 @ https:/espressif/arduino-esp32/releases/download/2.0.3/esp32-2.0.3.zip

@handmade0octopus
Copy link
Author

@Jason2866 I used most recent version using:

platform = espressif32
platform_packages = framework-arduinoespressif32 @ https:/espressif/arduino-esp32.git

So I don't think this is the issue.

@SuGlider I could potentially give you access to my private repo so you can download and build beta branch but I would have to trust you that you won't:

  1. Leak it
  2. Make fun of my bad coding :D

@SuGlider
Copy link
Collaborator

SuGlider commented Jun 2, 2022

@SuGlider I could potentially give you access to my private repo so you can download and build beta branch but I would have to trust you that you won't:

1. Leak it

Wow, I prefer not to have access to private code or sensitive information. Thanks for the trust!

2. Make fun of my bad coding :D

Never! :-)

Anyway, I undesrtand that the issue is solved. It that right?

@handmade0octopus
Copy link
Author

@SuGlider I plan to release this code MIT at some point in the future so it's not that bad. Also you have your linkedin linked to your github so not hard to find you! Also every code is open source if you know assembler! :D

Yes serial issue is fixed, I will keep an eye on heap usage.

Is there a chance you could show me how to get certain arduino versions like in example above?

platform_packages = framework-arduinoespressif32 @ https:/espressif/arduino-esp32.git#142fceb8563cd1795d619829e0a103770a344e1a

Where to get this number after hash? I want to test different versions to test heap usage, maybe this way I could find version where issues started.

Also do I understand correctly Serial2.setRxTimeout(1); will be added as default in upcoming versions? (just so I need to remember about it starting new projects).

@lbernstone
Copy link
Contributor

That number is a github commit number. If you go to the releases page, you will see the 7-digit commit number in the left column under the tag string. Click on it, and you will see the full number (which starts with those 7 digits) in the right side at the top.

@SuGlider
Copy link
Collaborator

SuGlider commented Jun 3, 2022

@SuGlider I plan to release this code MIT at some point in the future so it's not that bad. Also you have your linkedin linked to your github so not hard to find you! Also every code is open source if you know assembler! :D

:-) Great!
Let us know when your project goes public!

Yes serial issue is fixed, I will keep an eye on heap usage.

Happy to know that.

Is there a chance you could show me how to get certain arduino versions like in example above?

platform_packages = framework-arduinoespressif32 @ https:/espressif/arduino-esp32.git#142fceb8563cd1795d619829e0a103770a344e1a

Where to get this number after hash? I want to test different versions to test heap usage, maybe this way I could find version where issues started.

I'm not expert in PIO, but I can tell you that 142fceb8563cd1795d619829e0a103770a344e1a is the hash code from a specific commit in Github. In that case it is the 2.0.3 commit released.
You can go to https:/espressif/arduino-esp32/releases and see that each commit has a hash code associated to the release version.

image

Position the mouse over that link and you will see the full hash number.

image

Also do I understand correctly Serial2.setRxTimeout(1); will be added as default in upcoming versions? (just so I need to remember about it starting new projects).

HardwareSerial::setRxTimeout() is already in the release 2.0.3.
Thus, it has already been added. Is that your question?

@handmade0octopus
Copy link
Author

HardwareSerial::setRxTimeout()

@SuGlider I meant if value of 1 will be default value for HardwareSerial::setRxTimeout(1) since it brought back right performance for it?

@SuGlider
Copy link
Collaborator

SuGlider commented Jun 3, 2022

HardwareSerial::setRxTimeout()

@SuGlider I meant if value of 1 will be default value for HardwareSerial::setRxTimeout(1) since it brought back right performance for it?

I see. I remember that IDF sets it to 10, for some reason...
I need to investigate it further. If there is no issue associated and the performance is better, I agree that it shall go to 1 as default.

@SuGlider
Copy link
Collaborator

SuGlider commented Jun 4, 2022

@handmade0octopus @gonzabrusco
I have investigated the default value of 10 for RX Timeout and I think that it can go to 1, instead of 10.
I also will introduce the IDF call to uart_set_always_rx_timeout(uart_num, true); in the Arduino Core.
This will trigger timout even when RX FIFO is full, which is not activated by default.
There are some issues with 120 bytes in FIFO, not triggering available() and so on...

Please let me know your point of view about this change.
I want to know if you may have some considerations about it.

Thanks.

@handmade0octopus
Copy link
Author

handmade0octopus commented Jun 4, 2022

Ok I am doing tests of different versions:

Commit Max heap Free heap
platform_packages = framework-arduinoespressif32 (most recent) 63.99 KB 164-165 KB
063119a 107.99 KB 162-164 KB
bd2be80 63.99 KB 162-164 KB
2673b88 63.99 KB 165.20 KB
platform = [email protected] and @3.1.0 111 KB 203 KB

Hard to test as one commit downloads like 1h due to low speed (around 500 kb/s)

@SuGlider I have question though, is new Arduino version more optimised for uint32_t size to have faster memory access?

If so I can live with this 50 KB of more usage. But is it so?

Should we open another issue for this? Not sure how to best use Github.

@gonzabrusco
Copy link
Contributor

me know your point of view ab

No problem for me if the Arduino behaviour stays the same and the setRxTimeout keeps working (I set the timeout to 3 characters for modbus).

Just one note, one time I tried setting the timeout to 1 character and I noticed A LOT of false triggers of the callback. With a symbol time of 2 it stopped happening. Maybe at 1 it gets TOO sentitive... not sure if it's the safest default value.

@VojtechBartoska VojtechBartoska removed this from the 2.0.4 milestone Jun 15, 2022
@SuGlider
Copy link
Collaborator

We have done a few updates to UART in the new Arduino Core 2.0.5 version.
There are new functions that may help your applications:

  • void onReceive(OnReceiveCb function, bool onlyOnTimeout = false);
    onReceive() will setup a callback that will be called whenever an UART interruption occurs (UART_INTR_RXFIFO_FULL or
    UART_INTR_RXFIFO_TOUT). This means that it is possible to set up a callback that will be executed as soon as data is
    received.
    UART_INTR_RXFIFO_FULL interrupt triggers at UART_FULL_THRESH_DEFAULT bytes received (defined as 120 bytes by
    default in IDF). This default value of 120 can be changed using setRxFIFOFull() with a value from 1 to 127.
    UART_INTR_RXFIFO_TOUT interrupt triggers at UART_TOUT_THRESH_DEFAULT symbols passed without any reception
    (defined as 2 symbols by default). This can also be changed with setRxTimeout()

    onlyOnTimeout parameter will define how onReceive will behave:
    true -- The callback will only be called when RX Timeout happens.
    Whole stream of bytes will be ready for being read on the callback function at once.
    This option may lead to Rx Overflow depending on the Rx Buffer Size and number of bytes received
    in the streaming
    Default: false -- The callback will be called when FIFO reaches RXFIFO_FULL bytes and also on RX Timeout.
    The stream of incommig bytes will be "split" into blocks of minimum RXFIFO_FULL bytes on each callback.
    This option avoids any sort of Rx Overflow, but leaves the UART packet reassembling work
    to the Application.

  • void onReceiveError(OnReceiveErrorCb function);
    onReceiveError() will be called on error events (see hardwareSerial_error_t) and it will also call the onReceive() callback in
    case some was defined. By this, it is possible to read data using onReceive() callback when a BREAK is sent to UART, for
    instance.

  • void setRxTimeout(uint8_t symbols_timeout);
    setRxTimeout() sets the timeout after which onReceive callback will be called (after receiving data, it waits for this time of UART rx inactivity to call the callback fnc)
    param symbols_timeout defines a timeout threshold in uart symbol periods.
    Setting 0 (zero) symbol timeout disables the callback call by timeout.
    Maximum timeout setting is calculacted automatically by IDF. If set above the maximum, it is ignored and an error is printed on Serial0 (check console).
    For example symbols_timeout=1 defines a timeout equal to transmission time of one symbol (~11 bit) on current baudrate.
    For a baudrate of 9600, SERIAL_8N1 (10 bit symbol) and symbols_timeout = 3, the timeout would be 3 / (9600 / 10) = 3.125 ms

  • void setRxFIFOFull(uint8_t fifoBytes);
    setRxFIFOFull() will set the number of bytes that will trigger UART_INTR_RXFIFO_FULL interrupt and fill up RxRingBuffer
    This affects some functions such as Serial::available() and Serial.read() because, in a UART flow of receiving data,
    Serial internal RxRingBuffer will be filled only after these number of bytes arrive or a RX Timeout happens.
    This parameter can be set to a low value, such as 1 in order to receive byte by byte, but it will also consume more
    CPU time as the ISR will be activates often.

@VojtechBartoska VojtechBartoska changed the title Performance issues with new version UART Performance issues with 2.0.2 version Sep 21, 2022
@VojtechBartoska VojtechBartoska added this to the 2.0.6 milestone Sep 21, 2022
@SuGlider
Copy link
Collaborator

Ok I investigated UART thing a little.

I will try to explain it to you in minimal simplified code - basically I run a task inside a loop:

// serial == Serial2.begin(9600, SERIAL_8E2);
xTaskCreatePinnedToCore(
		serialTask,
		"serialTask",
		8192,
		(void*) 0,
		8, // I set different priority - no change
		NULL,
		0); // I set different cores - no change

uint8_t hexAddress[255];
uint8_t response[255];

void serialTask(void* param) {
	while(1){
		ds2.sendCommand(hexAddress);
		uint32_t readResponse = ds2.receiveData(response);
		if(readResponse) {
			// do stuff
		}
		delay(1);
	}
}

uint8_t DS2::sendCommand(uint8_t command[], uint8_t respLen) {
	if(messageSend) {
		return 0;
	} else {
		if(respLen != 0) responseLength = respLen;
		timeStamp = millis();
		messageSend = true;
		respLen= serial.write(data, respLen);
		serial.flush(); // Commenting flush off actually improves performance!
		return respLen;
	}
}

uint8_t DS2::receiveData(uint8_t data[]) {
	log_e("Time: %d", millis() - timeStamp); // Here I log what I got below <===
	if(messageSend) {
		if(readData(data)) {
		        messageSend = false;
			return true;
		}
	}
	return false;
}

boolean DS2::readData(uint8_t data[]) {
	uint32_t startTime = millis();
	uint8_t availible = serial.available();
	if(availible > 2) {
		// do stuff
		return true;
	}
	return false;
}

I made this simple test setup.

Basically on version 1.0.5 I test I have very consistent:

[E][DS2.cpp:55] receiveData(): Time: 7
[E][DS2.cpp:55] receiveData(): Time: 7
[E][DS2.cpp:55] receiveData(): Time: 7
[E][DS2.cpp:55] receiveData(): Time: 7
[E][DS2.cpp:55] receiveData(): Time: 7
[E][DS2.cpp:55] receiveData(): Time: 7
[E][DS2.cpp:55] receiveData(): Time: 7
[E][DS2.cpp:55] receiveData(): Time: 7
[E][DS2.cpp:55] receiveData(): Time: 7

And on 2.0.3 I have:

[877344][E][DS2.cpp:55] receiveData(): Time: 7
[877345][E][DS2.cpp:55] receiveData(): Time: 8
[877346][E][DS2.cpp:55] receiveData(): Time: 9
[877347][E][DS2.cpp:55] receiveData(): Time: 10
[877350][E][DS2.cpp:55] receiveData(): Time: 13
[877355][E][DS2.cpp:55] receiveData(): Time: 18

[877464][E][DS2.cpp:55] receiveData(): Time: 7
[877465][E][DS2.cpp:55] receiveData(): Time: 8
[877466][E][DS2.cpp:55] receiveData(): Time: 9
[877467][E][DS2.cpp:55] receiveData(): Time: 10
[877470][E][DS2.cpp:55] receiveData(): Time: 13
[877475][E][DS2.cpp:55] receiveData(): Time: 18

And if I remove flush() after write it gets even funnier!

On 1.0.5 consistent:

[E][DS2.cpp:55] receiveData(): Time: 1
[E][DS2.cpp:55] receiveData(): Time: 2
[E][DS2.cpp:55] receiveData(): Time: 3
[E][DS2.cpp:55] receiveData(): Time: 4

[E][DS2.cpp:55] receiveData(): Time: 1
[E][DS2.cpp:55] receiveData(): Time: 2
[E][DS2.cpp:55] receiveData(): Time: 3
[E][DS2.cpp:55] receiveData(): Time: 4

[E][DS2.cpp:55] receiveData(): Time: 1
[E][DS2.cpp:55] receiveData(): Time: 2
[E][DS2.cpp:55] receiveData(): Time: 3
[E][DS2.cpp:55] receiveData(): Time: 4

On 2.0.3 though:

[ 11468][E][DS2.cpp:55] receiveData(): Time: 1
[ 11469][E][DS2.cpp:55] receiveData(): Time: 2
[ 11470][E][DS2.cpp:55] receiveData(): Time: 3
[ 11471][E][DS2.cpp:55] receiveData(): Time: 4
[ 11474][E][DS2.cpp:55] receiveData(): Time: 7
[ 11478][E][DS2.cpp:55] receiveData(): Time: 11
[ 11483][E][DS2.cpp:55] receiveData(): Time: 16

[ 11589][E][DS2.cpp:55] receiveData(): Time: 1
[ 11590][E][DS2.cpp:55] receiveData(): Time: 2
[ 11591][E][DS2.cpp:55] receiveData(): Time: 3
[ 11592][E][DS2.cpp:55] receiveData(): Time: 4
[ 11595][E][DS2.cpp:55] receiveData(): Time: 7
[ 11599][E][DS2.cpp:55] receiveData(): Time: 11
[ 11604][E][DS2.cpp:55] receiveData(): Time: 16

Hope that helps you to try to tell me to what test next..

To be honest I am lost completely. Looks like new version sends (or receives) data much slower and can't cope with changing serial that fast.

A few notes:

serial.flush(); // Commenting flush off actually improves performance!
Indeed, flush() will block execution until all data is moved out through UART.

Please read this post: #6921 (comment)
Changing FIFO Full parameter may improve performance, depending on the desired behaviour.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: Performance Issue related to performance problems and improvements Status: Test needed Issue needs testing
Projects
Development

No branches or pull requests

7 participants