ROOL Forum: Recent Posts

Basic Compiler object for Pi 4

Thu, 09 Jul 2020 19:44:31 -0000

I have Version 4.18 (25 Nov 2016) of the !ABC Basic Compiler and was pleasantly surprised to see that it works on a Pi 4.

Following the section “How to Make a Module” in the User Guide of my old Archimedes Basic Compiler, I can compile a Module task and produce a 26-bit compatible module which runs quite happily under Aemulor.

Is there a later version of the compiler which will produce 32-bit output?

CObey

Thu, 09 Jul 2020 19:44:01 -0000

To be clear, I’m talking about Simon Bullen’s ‘Fortify’ header and C source files, which I’ve used in a huge amount of RISC OS software and which are part of the libraries I use (and therefore part of the build system – that’s why it’s really just a matter of setting `FORTIFY = yes` in the makefile to make it work). Other things called Fortify exist, which is why I’m explaining this.

Many instances of it exist on the Internet, and just searching for “Header file for fortify.c” will find it used in many places. To actually use it you `#define FORTIFY` (or define that in your compiler invocation) and `#include "fortify.h"` in your source (and obviously build fortify.c as well). It then replaces your uses of malloc and free (and friends) with its own tracking versions, which will then warn you when you do things oddly. Modern C libraries and compilers have options to do some of this for you too (AddressSanitizer, for example).
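
As a minimal sketch of that pattern (assuming fortify.h is on the include path; exactly when Fortify reports the error may vary by version):

#define FORTIFY              /* or pass -DFORTIFY on the compiler command line */
#include <stdlib.h>
#include "fortify.h"         /* after stdlib.h so it can wrap malloc/free */

int main(void)
{
    char *p = malloc(16);    /* tracked by Fortify’s replacement malloc */
    free(p);
    free(p);                 /* double free – Fortify flags this */
    return 0;
}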

I’m not sure of a canonical location for the source; it’s under a slightly different license to what you might be used to in that it declares what you cannot do with it, like distributing modified versions. Which is sad, because it’s a great little library, and there have been some cool things done with it (I built some nice hierarchical diagnostics from it back when I was at Picsel).

Because of that, Artifex (who develop Ghostscript and muPDF) created an entirely distinct implementation in the spirit of Fortify, called Memento. However, that has its own restriction that if you distribute things with it you’ve got to place them under the AGPL, which is an even more frustrating license. Fortunately, if you use it for development and don’t actually distribute anything built with it, you’re under no obligations.

Hearsay

Thu, 09 Jul 2020 19:32:48 -0000

I’m back in the game. Despite DDT stopping when it met a Desk function, the problem was that Connector defaults to using the InternalPC serialdev, which is 26-bit. So I’ve defaulted to Internal32 instead – that crashed too, but I found the offending 26-bit instruction. Not many serialdevs work on the Pi 4, but Internal32 and SerialUSB do for me. For some reason the Dummy SerialDev also fails.

Anyway, back to the beginning: would anyone like to try ConnectorTest? Hopefully it works now. If you are already using SerialDev, double-clicking on the attached version will load the updated driver. The SerialUSB module is in !SerialDev and is loaded by the SerialDev driver.

I’ve changed the default settings to be more useful, so if you need 115200 8N1 it should work straight away. If you have saved Connector settings those will be used, but if a driver other than InternalPC or SerialUSB is loaded it may well crash.

Module termination

Thu, 09 Jul 2020 19:32:41 -0000

Something that has always bugged me about modules in RISC OS (other than SVC mode!) is that there’s no way for a module to legitimately self-terminate. There are all sorts of hacks (execute RMKill as an OSCLI, set up a tiny bit of code in the RMA to do it on a callback, etc etc), but they all suffer from either potentially re-entering a removed/unlinked module or small memory leaks from helper code that can’t be freed from within itself (though I’ve seen one that frees its own RMA allocation irregardless¹).
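
For concreteness, the first of those hacks looks roughly like this (MyModule is a placeholder name, and it has exactly the re-entrancy problem just described):

#include "kernel.h"

/* From inside the module’s own code: ask the OS to kill us.
   The RMA space we are executing from is freed while we are
   still running in it, which is why this is a hack. */
_kernel_oscli("RMKill MyModule");
/* Anything past this point runs from freed memory. */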

Wouldn’t it be so much simpler if there were a SWI that could mark a module as pending removal, so that upon exit from SVC mode (i.e. the module), it is removed?

¹ I’m using this word to wind up those who don’t think it’s a real word.

Thought on Multi Core Support:

Thu, 09 Jul 2020 19:24:04 -0000

What I just didn’t get is who is the “master” core and who are the “slaves”, as in Windows you can create as many threads as you want

I don’t think there is any special distinction. Any sensible OS will try to balance things out to maximise hardware utilisation.
Remember, also, that Windows allows process priorities to be set. My DVD ripper runs with “above average” priority.

But as I said, you can also directly assign a thread to a specific core

It’s called “affinity”. It can be useful when there are two things that need to run side by side but keep in step. The usual example is watching video, where one thread deals with decoding the video data and another decodes the audio. If they are running on separate cores, then there’s less of an issue of them fighting each other for available resources when any other core that might be present is doing something mundane.
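
For illustration only – RISC OS has no such call today, but this is what setting affinity looks like under Linux/glibc:

#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

/* Pin the calling thread to a single core. */
static int pin_to_core(int core)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}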

but with limitations on available SWIs.

I think the first thing to consider is what we mean by “thread”. It would work, and be viable, to have the parent app set up two blocks of shared memory (an input and an output), stuff some data into a structure that can be read by the thread (say, the start of the output buffer), and then just fire up the thread. The thread loads data into registers and does whatever it does with no OS calls at all, and runs to completion.
That would be feasible for simple image decoding (say, converting something into a 16M-colour sprite). Or for scaling an image. Sharpening. Gaussian blur. Etc etc.
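
A minimal sketch of that run-to-completion model (all names invented for illustration):

/* The parent fills in a job block in shared memory, then fires
   up the thread with it; the thread makes no OS calls at all. */
struct job {
    const unsigned char *in;   /* shared input block  */
    unsigned int        *out;  /* shared output block */
    int                  count;
};

static void worker(struct job *j)
{
    /* e.g. widen 8bpp greyscale to 32bpp for a 16M-colour sprite */
    for (int i = 0; i < j->count; i++) {
        unsigned int g = j->in[i];
        j->out[i] = g | (g << 8) | (g << 16);
    }
    /* runs to completion; the parent collects j->out afterwards */
}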

So, step one – what exactly do we mean by “thread” in this context?

Thought on Multi Core Support:

Thu, 09 Jul 2020 18:30:21 -0000

The answer could be light threads: threads, but with limitations on available SWIs. More AMP than SMP, but close to what Jeffrey did a few years ago. And it would break nothing.

Does anyone have a Pi multicore ROM? It was on my list to try a few things from C.

I also love Michael’s approach, with complete OSes working on other cores (NetBSD, RISC OS), and I hope the two will work together.

Thought on Multi Core Support:

Thu, 09 Jul 2020 18:24:18 -0000

To me that pthread thing looks like a good approach. I wouldn’t expect any existing software to benefit from multicore. On x86/Windows, too, code has to be explicitly written to use threads.

Several years ago I wrote a Mandelbrot benchmark for x86 in assembler (Link) and some guys helped me with the threading routine. It reads the number of available cores, then sets up the same number of threads and assigns each thread to a specific core. Each thread calculates a line on the screen; once it’s done, a global counter in memory is incremented (using a ‘lock’ prefix – I guess there is similar stuff on ARM to prevent different cores writing to the same memory address at the same time). What I just didn’t get is who is the “master” core and who are the “slaves”, as in Windows you can create as many threads as you want (maybe limited to some number, I guess) and it will distribute them to the cores somehow anyway, whether one is available or not.
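
ARM’s answer to the x86 ‘lock’ prefix is the exclusive load/store pair, LDREX/STREX; in portable C the same idea is a C11 atomic, roughly:

#include <stdatomic.h>

static atomic_int lines_done;   /* shared between all worker threads */

/* Called by each thread when it finishes a line. The increment is
   safe even when several cores hit it at once; on ARM the compiler
   emits a LDREX/STREX retry loop for it. */
static void line_finished(void)
{
    atomic_fetch_add(&lines_done, 1);
}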

But as I said, you can also directly assign a thread to a specific core (to me it looks similar to setting up an interrupt routine). I guess an approach like this would be very transparent and might be easier to do for RISC OS? Maybe pthreads works similarly?

Thought on Multi Core Support:

Thu, 09 Jul 2020 17:43:34 -0000

I think the approach Rick Murray described in the last comment (and Jeffrey’s idea is the same, if I get it right) is the correct one, so nothing gets broken for existing programs. Only programs that are “multicore ready” will benefit from multicore CPUs. Actually the multicore API has nothing to do with the actual WIMP poll itself; the apps merely access the multicore API during the WIMP poll.

The only question is how, and when, the applications get the data back from the other cores.

But I think the app should be aware of how many cores are available, as the application must split the task according to the available number of cores. At least, that’s how I believe it should work for processing images: one image is cut into different pieces (5 cores = 5 pieces). So there should be a call like MultiCoreAPI_GetCoreNumbers that returns the number of available cores. That might also vary if a core is already in use by another app (let’s say a browser).
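
As a sketch of that idea (MultiCoreAPI_GetCoreNumbers is this post’s hypothetical call, and dispatch_to_core is equally invented):

/* Split an image of 'height' rows into one strip per core and hand
   each strip to a core. Both calls are hypothetical. */
void split_image(int height)
{
    int cores = MultiCoreAPI_GetCoreNumbers();   /* hypothetical, e.g. 5 */
    int strip = (height + cores - 1) / cores;    /* round up */
    for (int c = 0; c < cores; c++) {
        int first = c * strip;
        int last  = first + strip;
        if (last > height) last = height;
        dispatch_to_core(c, first, last);        /* hypothetical dispatch */
    }
}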

Slow Pandaboard Transfer Speeds

Thu, 09 Jul 2020 17:33:45 -0000

A FIQProf profile of a (single-tasking?) file upload/download over a high-latency link might allow you to quickly identify where all the busy-wait loops are and what the triggers are for leaving those loops.

Under RPCEmu RISC OS consistently responds to the sender in under 2 milliseconds, and no packets are dropped. The delay is waiting for the network cables and sender to send more data.

So basically my contribution is “stop speculating and profile the code, you big dummies!” ;-)

A packet sniffer is more useful.

The problem is that RISC OS never sends a window size larger than 17376 bytes, so the sender correctly never sends more than 17376 bytes beyond what it has received acknowledgements for. The window size and receive buffer need to be a few megabytes in size.

Trying to adjust the receive buffer using sysctl doesn’t help:

sysctl -w net.inet.tcp.recvspace=100000

Attempting to set a 1 megabyte receive buffer using sysctl causes RISC OS to fail to create a TCP socket:

sysctl -w net.inet.tcp.recvspace=1000000

Adjusting the receive buffer with setsockopt does speed it up:

int bufsize = 262144;   /* SO_RCVBUF takes an int, not a size_t */
setsockopt(s, SOL_SOCKET, SO_RCVBUF, &bufsize, sizeof(bufsize));

However, attempting to set any value larger than 262144 returns ENOBUFS (No buffer space available), and the buffer needs to be a lot larger – about 2.5 megabytes for a 100 Mbit/s round-the-world connection.
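
That figure is just the bandwidth-delay product: with a round-the-world round-trip time of roughly 200 ms,

100 Mbit/s × 0.2 s = 20 Mbit ≈ 2.5 megabytes

and any window over 64 KB additionally requires the TCP window scale option to be negotiated.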

Trying the same for the send buffer breaks my test case: it closes the connection without sending anything.

Making the default large enough would, however, quickly exhaust the available memory. Linux and FreeBSD, at least, adjust the size of the buffers to match network conditions.

Generally you get to the stage where a packet buffer isn’t available and you have to drop the packet

without any way to stop TCP retries (flow control)

Do you have any evidence of RISC OS dropping packets on a TCP connection? A slow application should not cause TCP retries; instead, the TCP window size sent to the sender should be reduced.

Thought on Multi Core Support:

Thu, 09 Jul 2020 15:41:23 -0000

But I think it would be tricky to implement in a way that won’t break existing apps (just take a look at Wimp2).

The thing with Wimp2 was that it tried to introduce new behaviour to existing apps.

What a multicore Wimp should do for existing apps is nothing. They won’t run faster; they’ll just be the same as they are now.
An extended API should be available so that apps which support the concept of threads (I think you’ll find the majority of current Wimp apps do not; it’s more a case of “in response to this event, do this”) can notify the Wimp that threads are supported, and then inform the Wimp when a thread needs to be run – likely as a distinct process from the app itself (leaving the app to deal with the messages and events and other UI stuff) – and the Wimp can juggle these threads around the available cores, as in the sketch below.
It should probably include some form of basic preemption, because “one” is a valid number of cores. Whether there’s one or twenty, it’s not something for the app to be concerned with.
Said thread will run, either explicitly yielding or being interrupted, until it terminates. Distinct from the parent app.
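
A hypothetical veneer for that might look like this (every name below is invented purely to illustrate the shape of such an API; none of these calls exist today):

typedef void (*wimp_thread_fn)(void *arg);

extern void wimp_threads_supported(void);                    /* hypothetical */
extern void wimp_thread_start(wimp_thread_fn fn, void *arg); /* hypothetical */

static void decode_image(void *arg)
{
    /* heavy lifting only: no Wimp_Poll, no UI work */
    (void)arg;
}

static void start_work(void *job)
{
    wimp_threads_supported();             /* opt in once at startup */
    wimp_thread_start(decode_image, job); /* Wimp picks a core (maybe the only one) */
}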

How tricky could that be? ;-)

The problem is that some bits of the Wimp architecture are deterministic. Think of Wimp_MessageRecorded or drag saving.

Thought on Multi Core Support:

Thu, 09 Jul 2020 14:24:20 -0000

But I think it would be tricky to implement in a way that won’t break existing apps (just take a look at Wimp2).

However, what if the focus was only to dynamically spread the WIMP poll loop across different cores/processors? How tricky could that be? ;-)

The pthreads route also sounds like a good approach… :-)