0x1E doesn't trigger inst_x #3

Open
jcubic opened this issue Dec 27, 2020 · 6 comments

jcubic commented Dec 27, 2020

It seems that it should. I have code that renders ANSI artwork, and I need to map this character to a triangle, but the callback is not triggered for that ASCII control character.

Offending artwork https://16colo.rs/pack/laz12/fil-POWSTAC.ans

Author

jcubic commented Dec 27, 2020

I had to run this code before the parser to fix the ANSI artwork.

var cp_437_control = {
    0x00: ' ',
    0x0F: '*',
    0x12: '↕',
    0x18: '↑',
    0x19: '↓',
    0x11: '◄',
    0x1E: '▲',
    0x10: '█'
};
var cp_437_keys = Object.keys(cp_437_control);
if (ansi_art) {
    cp_437_keys.forEach(function(key) {
        var re = new RegExp(String.fromCharCode(key), 'g');
        input = input.replace(re, cp_437_control[key]);
    });
}

This should be handled by the parser.

Author

jcubic commented Dec 27, 2020

Adding EXECUTABLES.push(0x1E); fixed the issue. Any reason why not all control codes in ASCII are executable?

Collaborator

jerch commented Dec 27, 2020

That's a bug in the transition table: all separators (FS, GS, RS, US) are missing. They are not meaningful in DEC terminals, so they never got mapped to anything. In xterm.js they are treated as executables without any control function (and thus get effectively swallowed).

How to handle them depends on the domain. They were used in older document formats for limited structuring (as record or field delimiters by mainframes and library catalogue systems), and their visual representation is not defined in a general-purpose way (it might be pagination, field indentation, or something else, depending on the document type).

Possible fixes in terms of a general purpose sequence parser:

  1. Add them to the executables and implement their control function, probably with a switch for document-specific handling and a one-size-fits-all fallback. For document-specific treatment you'd need some way to announce the document type to the control functions.
  2. Add them to the printables and handle them in a later document-specific parser from the inst_p call (like xterm.js handles some Unicode controls later at the buffer level).
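
As a rough illustration of option 1, the sketch below routes the four separators through a per-document-type table with a single fallback. All names here (`SEPARATOR_HANDLERS`, `handleSeparator`, the `docType` strings) are hypothetical and not part of the parser. The '▲' mapping for 0x1E comes from the CP437 table earlier in this thread, and the swallowing fallback mirrors the xterm.js behaviour described above.

```javascript
// Hypothetical sketch: none of these names exist in the parser itself.
const SEPARATORS = [0x1C, 0x1D, 0x1E, 0x1F]; // FS, GS, RS, US

// Per-document-type control functions; each returns the text to print.
const SEPARATOR_HANDLERS = {
  // ANSI art: only 0x1E has a known glyph in this thread's CP437 table.
  ansiArt: code => (code === 0x1E ? '▲' : ''),
};

// Fallback for unknown document types: swallow the separator.
const swallow = () => '';

function handleSeparator(flag, docType) {
  const code = flag.charCodeAt(0);
  if (!SEPARATORS.includes(code)) {
    return null; // not a separator, leave it to the other control handlers
  }
  const known = Object.prototype.hasOwnProperty.call(SEPARATOR_HANDLERS, docType);
  const handler = known ? SEPARATOR_HANDLERS[docType] : swallow;
  return handler(code);
}
```

The document type is passed in explicitly here, which is one possible way to "announce" it to the control functions; a flag on the terminal object would work just as well.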

Note: more controls may be missing, since I developed the transition table from a global [error, GROUND] transition by filling in what is defined for DEC terminals. The best way to find the missing bits is to check for the error action inst_E.
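
Alongside watching inst_E, a quick pre-scan of the input can show which C0 control codes a given file actually uses that are not in a known list. The sketch below is a standalone check, not how the parser itself reports errors. `findUnhandledControls` is a hypothetical helper, and the example list is for illustration only; the parser's real EXECUTABLES contents may differ.

```javascript
// Hypothetical helper, not part of the parser: lists C0 control codes that
// occur in `input` but are missing from `executables`.
function findUnhandledControls(input, executables) {
  const known = new Set(executables);
  const missing = new Set();
  for (let i = 0; i < input.length; ++i) {
    const code = input.charCodeAt(i);
    // C0 controls are 0x00-0x1F; ESC (0x1B) starts sequences and is skipped here.
    if (code < 0x20 && code !== 0x1B && !known.has(code)) {
      missing.add(code);
    }
  }
  // Return the codes in ascending numeric order.
  return Array.from(missing).sort((a, b) => a - b);
}

// Example list for illustration only (TAB, LF, CR).
const exampleExecutables = [0x09, 0x0A, 0x0D];
const missingCodes = findUnhandledControls('a\x1Eb\r\n\x1Cc', exampleExecutables);
console.log(missingCodes.map(c => '0x' + c.toString(16).toUpperCase()));
```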

Author

jcubic commented Dec 27, 2020

inst_E is a nice tip, I didn't know about it. As for detecting documents, I'm only handling ANSI art, and I use a flag passed as an option.
I'm using this code:

inst_x: function(flag) {
    var code = flag.charCodeAt(0);
    if (code === 13) {
        this.cursor.x = 0;
    } else if (code === 10) {
        this.cursor.y++;
        if (!use_CR) {
            this.cursor.x = 0;
        }
    } else if (code === 9) {
        print.call(this, '\t');
    } else if (ansi_art && code in cp_437_control) {
        print.call(this, cp_437_control[code]);
    } else if (DEBUG) {
        var mod = code % characters.length;
        var char = characters[mod];
        print.call(this, char);
    }
    if (!this.result[this.cursor.y]) {
        this.result[this.cursor.y] = '';
    }
},

If ansi_art mode is enabled, I just map the inst_x code into a print, as inst_p would. I was using DEBUG to find out what the code was and how to print it: since I only had visuals to go on, I compared two images to see which character in one corresponded to which in the other.

Collaborator

jerch commented Dec 27, 2020

Yes, it looks good that way. The parser does not know anything about CPxxx-specific mappings, as it always operates on Unicode.

Minor sidenote on the CP437 replace above - is the regexp really needed there? Wouldn't a simple loop be much more performant? Something like this:

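for (let i = 0; i < cp_437_keys.length; ++i) {
  // Object.keys() yields decimal strings ("30"), so convert each key to its character first.
  const ch = String.fromCharCode(cp_437_keys[i]);
  // split/join replaces every occurrence; replace() with a string pattern only replaces the first.
  input = input.split(ch).join(cp_437_control[cp_437_keys[i]]);
}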

Edit: Or if you really want to go with one regexp something like this:

var cp_437_control = {
    '\x00': ' ',
    '\x0F': '*',
    '\x12': '↕',
    '\x18': '↑',
    '\x19': '↓',
    '\x11': '◄',
    '\x1E': '▲',
    '\x10': '█'
};
input = input.replace(/[\x00\x0F\x12\x18\x19\x11\x1E\x10]/g, c => cp_437_control[c]);
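
One possible refinement of the single-regexp variant is to derive the character class from the table's keys, so the table and the pattern cannot drift apart. This is only a sketch: `buildControlReplacer` is a hypothetical name, and the table below repeats just a few of the entries from this thread.

```javascript
// Hypothetical helper: builds one global regexp from the keys of `table`
// (single-character strings) and returns a function that applies it.
function buildControlReplacer(table) {
  const charClass = Object.keys(table)
    // Emit each key as a \uXXXX escape so no character needs special-casing.
    .map(ch => '\\u' + ch.charCodeAt(0).toString(16).padStart(4, '0'))
    .join('');
  const re = new RegExp('[' + charClass + ']', 'g');
  return input => input.replace(re, ch => table[ch]);
}

// A few of the CP437 control mappings from this thread.
const replaceControls = buildControlReplacer({
  '\x00': ' ',
  '\x18': '↑',
  '\x19': '↓',
  '\x1E': '▲',
});
```

The regexp is compiled once, and the returned function can then be applied to each input before it is handed to the parser.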

Author

jcubic commented Dec 27, 2020

Actually, I've removed the regex. It was just a first patch on top of ansiParser. Now, after adding this line:

EXECUTABLES.push(0x1E)

my only code is:

var code = flag.charCodeAt(0);
if (ansi_art && code in cp_437_control) {
  print.call(this, cp_437_control[code]);
}

in inst_x. The regex code was added first, before I found that I could just add the missing executable key.
