Educational Exercises and Activities #153

Open
AdamSobieski opened this issue May 14, 2024 · 14 comments

@AdamSobieski

AdamSobieski commented May 14, 2024

Introduction

Hello. I am pleased to share some brainstorming towards advancing the state of the art with respect to educational exercises and activities, e.g., homework, quiz, and exam items, sequences of such items, and interoperability with tutoring agents.

Experience API

The Experience API (xAPI) is an e-learning software specification that records and tracks various types of learning experiences for learning systems. Learning experiences are recorded in a Learning Record Store (LRS), which can exist within traditional learning management systems (LMSs) or on their own.

See also: xAPI.js.

Items

Items, e.g., homework items, can be HTML5-based resources so as to be able to utilize hypertext, fonts, stylesheets, scripts, metadata, images, animations, 3D graphics, audio and video.

Items may stream xAPI events to one or more learner-configured LRS's as they are interacted with by learners and upon completion, e.g., as they are answered by learners.
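
As a minimal sketch of what one such streamed event might look like, the following builds an xAPI statement for an answered item and the `fetch()` options for posting it to an LRS. The function names, the learner address, the item IRI, and the endpoint shape are assumptions made for illustration; the actor/verb/object/result structure, the "answered" verb IRI, and the `X-Experience-API-Version` header follow the xAPI specification.

```javascript
// Build an xAPI statement describing a learner answering an item.
// The learner mailbox and item IRI passed in are illustrative values.
function buildAnsweredStatement({ learnerMbox, itemId, response, success, durationSeconds }) {
  return {
    actor: { objectType: 'Agent', mbox: `mailto:${learnerMbox}` },
    verb: {
      id: 'http://adlnet.gov/expapi/verbs/answered',
      display: { 'en-US': 'answered' },
    },
    object: { objectType: 'Activity', id: itemId },
    result: {
      response: String(response),
      success: Boolean(success),
      // xAPI durations are ISO 8601 durations, e.g. "PT42S".
      duration: `PT${durationSeconds}S`,
    },
    timestamp: new Date().toISOString(),
  };
}

// Build fetch() options for sending one statement to an LRS.
// Authentication is deployment-specific, so the caller supplies the header value.
function buildStatementRequest(statement, authorization) {
  return {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'X-Experience-API-Version': '1.0.3',
      Authorization: authorization,
    },
    body: JSON.stringify(statement),
  };
}

// Send the statement to every learner-configured LRS.
// Each endpoint is assumed to be a base URL without a trailing slash.
async function streamToStores(statement, stores) {
  return Promise.all(
    stores.map(({ endpoint, authorization }) =>
      fetch(`${endpoint}/statements`, buildStatementRequest(statement, authorization)),
    ),
  );
}
```

An item could call `streamToStores()` once on completion, or several times with other verbs as the learner interacts with it.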

Items should also utilize JavaScript to signal upon completion that a next item can be presented to a learner. In this completion signal, an object may be passed as an argument to be returned to the item-sequencing control logic.
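
One way to approximate this completion signal with today's primitives is cross-document messaging from the item's frame to its parent. The sketch below is only an illustration: the `item-completed` message type, the message shape, and the function names are hypothetical and not part of any existing API.

```javascript
// Hypothetical message type used by items to tell the sequencer they are done.
const ITEM_COMPLETED = 'item-completed';

// Build the completion message; `result` is the object handed back to the
// item-sequencing control logic.
function buildCompletionMessage(itemId, result) {
  return { type: ITEM_COMPLETED, itemId, result };
}

// Check that an incoming message has the expected shape before trusting it.
function isCompletionMessage(data) {
  return (
    typeof data === 'object' &&
    data !== null &&
    data.type === ITEM_COMPLETED &&
    typeof data.itemId === 'string'
  );
}

// Inside an item's frame: notify the embedding sequencer.
// `sequencerOrigin` is the origin the item expects its parent to have.
function signalCompletion(itemId, result, sequencerOrigin) {
  if (typeof window !== 'undefined' && window.parent !== window) {
    window.parent.postMessage(buildCompletionMessage(itemId, result), sequencerOrigin);
  }
}
```

The sequencer would listen for `message` events, filter them with `isCompletionMessage()`, and then present the next item.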

Sequences of Items

Sequences of items could be bundled into OCF-based archives containing HTML5-based items and their multimedia resources.

JavaScript could be utilized to express both static and dynamic, e.g., adaptive, sequences of items. The following example shows a simple item sequence composed of four items:

@metadata('meta.json')
async function main()
{
    let result;

    result = await env.next('item1.html');
    result = await env.next('item2.html');
    result = await env.next('item3.html');
    result = await env.next('item4.html');
}
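
For contrast with the static sequence, here is a rough sketch of an adaptive one that branches on the object returned by each item. It reuses the `env.next()` call from the example, but passes `env` in as a parameter so that it can be exercised with a stand-in; the `correct` field on the result and the remedial item name are assumptions of this sketch.

```javascript
// Decide which item to present next from the result of the previous one.
// The `correct` field on the result object is an assumption of this sketch.
function chooseNext(current, result) {
  if (current === 'item1.html') {
    // Learners who miss the first item get a remedial item before moving on.
    return result && result.correct ? 'item2.html' : 'item1-remedial.html';
  }
  if (current === 'item1-remedial.html') return 'item2.html';
  if (current === 'item2.html') return 'item3.html';
  return null; // End of the sequence.
}

// Drive the sequence, recording which items were presented.
async function main(env) {
  const visited = [];
  let current = 'item1.html';
  while (current !== null) {
    visited.push(current);
    const result = await env.next(current);
    current = chooseNext(current, result);
  }
  return visited;
}
```

Keeping the branching in a separate function such as `chooseNext()` lets the same logic be tested without a browser, or swapped for a call to a remote adaptive service.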

Presenting a sequence of items may require no local or remote service at all; may optionally use one or more local or remote services, e.g., to provide learners with adaptive or personalized sequences; may require access to one or more such services to function; or may operate offline, storing educational data locally and expecting to connect to the Internet at a later point.

See also: APIs related to navigation and session history
See also: Infrastructure for sequences of documents

Tutoring Agents

Bidirectional communication is envisioned between items and tutoring agents.

Item-to-tutor Communication

Web browsers should be able to share items' events (e.g., initialization, finalization) and data with tutoring agents.

These data could be used to populate portions of LLMs' prompts or to enhance multimodal dialogue contexts.

These data could include items' natural-language educational objectives, instructions, descriptions, and hints.

These data could include sets of described "landmarks" in items' accompanying multimedia resources. Capable tutoring agents could make use of these described "landmarks" to select, show, and highlight content in items' multimedia resources.

These data could include sets of described input fields. Capable tutoring agents could make use of these to enter data obtained through dialogue into the input fields on learners' behalf.

These data could include hints for learners.
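
To make the prompt-population idea concrete, here is a small sketch that turns an item's shared data into a plain-text context block for a tutoring agent. The field names (`objectives`, `instructions`, `description`, `hints`) mirror the kinds of data listed above, but the exact shape and the function name are assumptions.

```javascript
// Assemble a plain-text context block from an item's shared metadata.
// Empty or missing sections are skipped.
function buildTutorContext(item) {
  const sections = [
    ['Educational objectives', item.objectives],
    ['Instructions', item.instructions],
    ['Description', item.description],
    ['Hints', item.hints],
  ];
  return sections
    .filter(([, value]) => (Array.isArray(value) ? value.length > 0 : Boolean(value)))
    .map(([label, value]) => {
      // Lists are rendered as bullet lines; strings are used as-is.
      const body = Array.isArray(value) ? value.map((v) => `- ${v}`).join('\n') : value;
      return `${label}:\n${body}`;
    })
    .join('\n\n');
}
```

A tutoring agent could prepend the returned text to its prompt whenever a new item is presented.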

Tutor-to-item Communication

Selecting and Highlighting Items' Landmarks

A mathematics homework item, for example, might have an accompanying illustration depicting a right triangle with three sides and three angles. A tutoring agent would be able to select, show, and highlight any of these six described "landmarks" in the illustration.

A physics homework item, for example, might have an accompanying illustration depicting two penguins, a rope, and a pulley. A tutoring agent would, similarly, be able to select, show, and highlight any of these four described "landmarks" in the illustration.

These approaches should work with 2D content, stateful multimedia, animations, and 3D graphics visualizations.

Entering Data into Items' Input Fields

Tutoring agents could also select multiple-choice answers and populate items' input fields with learner-specified content on learners' behalf. That is, tutoring agents could serve as natural-language user interfaces to homework items.

A learner might, for example, verbally tell a tutoring agent that an item's right triangle's hypotenuse was "5", and that tutoring agent could enter that specified value into an appropriate input field of the item.
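
The two tutor-to-item capabilities above could be modeled as a small interface that an item exposes. The following is a sketch under stated assumptions: the `createItemInterface` function, the identifiers, and the descriptions are all hypothetical, and a real item would update its illustration and form controls rather than in-memory state.

```javascript
// A hypothetical item interface: described landmarks and input fields that a
// tutoring agent can address by identifier.
function createItemInterface({ landmarks, fields }) {
  const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);
  const highlighted = new Set();
  const values = new Map();

  return {
    // Descriptions a tutoring agent could read to decide what to point at.
    describe() {
      return { landmarks: { ...landmarks }, fields: { ...fields } };
    },
    // Highlight one described landmark; unknown identifiers are rejected.
    highlight(id) {
      if (!has(landmarks, id)) return false;
      highlighted.add(id);
      return true;
    },
    // Enter a learner-specified value into a described input field.
    enter(id, value) {
      if (!has(fields, id)) return false;
      values.set(id, String(value));
      return true;
    },
    // Current highlighted landmarks and entered values.
    state() {
      return { highlighted: [...highlighted], values: Object.fromEntries(values) };
    },
  };
}

// The right-triangle item: three sides and three angles, plus one answer field.
const triangleItem = createItemInterface({
  landmarks: {
    'side-a': 'Leg a',
    'side-b': 'Leg b',
    'side-c': 'Hypotenuse c',
    'angle-A': 'Angle opposite leg a',
    'angle-B': 'Angle opposite leg b',
    'angle-C': 'Right angle',
  },
  fields: { hypotenuse: 'Length of the hypotenuse' },
});

// The tutoring agent points at the hypotenuse and enters the learner's answer.
triangleItem.highlight('side-c');
triangleItem.enter('hypotenuse', '5');
```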

Other Technical Discussion Topics

Artificial Intelligence

Artificial intelligence systems could utilize homework items as training data and interact with and solve these items.

Interprocess / Interapplication Communication

A tutoring agent and educational content, e.g., homework items, could be provided in one browser tab. However, these software components (e.g., digital textbooks, tutoring agents) would probably be from separate vendors. These software components could also be provided in two or more interoperating browser tabs (e.g., in separate browser windows).

In particular, if a tutoring agent were not Web-based, interprocess / interapplication communication between tutoring agents' client applications and Web browsers would benefit these educational exercises and activities scenarios.

Services

Services, e.g., one or more xAPI LRS's, could be managed by platforms and subsequently loaded by and utilized by educational resources through interfaces. In this way, learners would not have to repeatedly log on to or connect to services, e.g., LRS's, per educational resource or activity therein.

Preferences, Settings, and Configuration

Educational resources could store preferences, settings, and configuration on platforms, each having one or more (e.g., URI-based) keys, and each being available via one or more hierarchical paths (pages and nested sections of settings) which could be utilized for access control, navigational, and display purposes.

Learners could search for, retrieve, navigate to, and access (e.g., read, write) extensible preferences, settings, and configuration using their platforms' unified settings areas.
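
As a rough illustration of settings that carry both a URI-based key and a hierarchical path, the sketch below uses an in-memory stand-in for a platform's settings area. The class name, the key URIs, and the path segments are hypothetical.

```javascript
// In-memory stand-in for a platform's unified settings area.
class SettingsStore {
  constructor() {
    this.entries = new Map(); // key URI -> { path, value }
  }

  // Register a setting under a URI key and a hierarchical path such as
  // ['Education', 'Learning Record Stores'].
  set(key, path, value) {
    this.entries.set(key, { path: [...path], value });
  }

  // Look a setting up directly by its URI key.
  get(key) {
    const entry = this.entries.get(key);
    return entry ? entry.value : undefined;
  }

  // List the keys whose path starts with the given prefix, e.g., for
  // rendering one page or nested section of settings.
  keysUnder(prefix) {
    return [...this.entries]
      .filter(([, { path }]) => prefix.every((segment, i) => path[i] === segment))
      .map(([key]) => key);
  }
}
```

Access control could then be expressed per path prefix, while educational resources continue to address individual settings by key.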

Nested Frames

As considered, sequences of items displayed to learners may make use of nested frames.

Conclusion

Thank you. Per the WICG proposal process,

  1. Submit a proposal outlining your idea.
  2. Get feedback and improve your proposal.
  3. Find collaborators and create a GitHub repository.
  4. Work on your proposal and seek consensus from the community.
  5. Advocate for adoption of your proposal to the W3C or the WHATWG for standardization.

I am looking forward to discussing and improving this preliminary proposal with your feedback and to finding interested collaborators to create fuller documents with which to spur innovation and to seek consensus from the community and stakeholders.

@marcoscaceres
Contributor

Hi Adam,

Thank you for your proposal. I wanted to point out that everything you're suggesting can already be implemented using JavaScript without requiring changes to HTML standards or additional browser support.

Accessing Global Objects in iframes

  • Same-Origin Frames: Direct access to the parent’s global objects using window.parent.
  • Cross-Origin Frames: Communication is possible using the postMessage API, adhering to the same-origin policy for security.
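
For the cross-origin case, a typical pattern is to check `event.origin` before acting on a message. The following sketch wraps a handler with an allow-list; the helper name and the example origin are illustrative.

```javascript
// Wrap a message handler so that it only runs for allow-listed origins.
function createOriginCheckedHandler(allowedOrigins, onMessage) {
  const allowed = new Set(allowedOrigins);
  return (event) => {
    if (!allowed.has(event.origin)) return false; // Ignore unknown senders.
    onMessage(event.data, event.source);
    return true;
  };
}

// In a browser, the wrapped handler would be registered on the window.
if (typeof window !== 'undefined') {
  window.addEventListener(
    'message',
    createOriginCheckedHandler(['https://items.example.com'], (data) => {
      console.log('message from item frame:', data);
    }),
  );
}
```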

Custom Formats and SPARQL

  • Custom data formats like .n3 and querying languages such as SPARQL can be handled entirely in JavaScript. This allows for parsing, querying, and manipulating data without the need for standardization or native browser support.

Fetching Network Resources

  • The Fetch API provides a powerful and flexible way to make network requests and handle responses, supporting a wide range of use cases without requiring any additional browser features.

Security Considerations

  • The same-origin policy is crucial for protecting sensitive data, and existing APIs like postMessage ensure secure cross-origin communication.

By leveraging JavaScript, you can achieve the functionality you're looking for without necessitating changes to existing HTML standards.

If there are specific scenarios where these capabilities seem insufficient, please share more details so we can explore them further.

@AdamSobieski
Author

It does appear that that initial set of features can be implemented via a set of JavaScript libraries.

I have since updated the proposal with a few new ideas: (1) obtaining settings and configurations from operating systems or Web browsers (e.g., education-related service endpoints, e.g., learners' schools' servers), and (2) education-related bidirectional interprocess communication scenarios, e.g., with intelligent tutoring systems.

The current conceptual model utilizes nested frames:

  • Educational resource
    • Item collection and scheduling logic
      • HTML5-based item

Adding considered MIME types:

  • Educational resource (application/xhtml+xml, possibly from inside of an EPUB archive)
    • Item collection and scheduling logic (application/????+zip)
      • HTML5-based item (application/xhtml+xml from inside of the item collection)

Any thoughts on the conceptual model?

As considered, collections of homework, quiz, and exam items could be gathered together into zip archives along with those multimedia resources utilized by them (e.g., hypertext, fonts, stylesheets, scripts, metadata, images, animations, 3D models, audio clips). Should this be a way to go, a feature for Web browsers would be to recognize packages' MIME type to unzip and load up their contents for display.

Another detail would be to ensure that nested items could be maximized to fully utilize content display areas, and/or that fullscreen was possible, for those HTML5-based items in nested frames. It may be the case that these features are already possible with some scripting logic.

I would like a bit more time to brainstorm on your question about Web browser feature ideas. Presently, ideas include: (1) obtaining learners' operating system or Web browser settings with respect to education-related service endpoints (e.g., education-related service endpoints, e.g., learners' schools' servers), and (2) education-related bidirectional interprocess communication scenarios, e.g., with intelligent tutoring system applications.

@marcoscaceres
Contributor

> Any thoughts on the conceptual model?

Right, but what I was getting to was: what does the Web Platform not give you (as a primitive) to meet your requirements?

Conceptually, this can't be domain specific. The web generally only deals with generalized use cases, not, say, "education use cases"... those may be covered generally, however.

> (1) obtaining learners' operating system or Web browser settings with respect to education-related service endpoints (e.g., education-related service endpoints, e.g., learners' schools' servers), and

That would need to be weighed against user privacy. There is little reason to trust such institutions from a user's perspective - or for those institutions to trust themselves with such privacy sensitive data.

> (2) education-related bidirectional interprocess communication scenarios, e.g., with intelligent tutoring system applications.

As with the first question: what can't you do over fetch, web sockets, or WebRTC or whatever?

@AdamSobieski
Author

AdamSobieski commented May 21, 2024

> That would need to be weighed against user privacy. There is little reason to trust such institutions from a user's perspective - or for those institutions to trust themselves with such privacy sensitive data.

Traditionally, to turn in homework assignments or to hand in completed quizzes or exams, learners have provided their teachers and schools with some educational data. Beyond completed sets of items, more modern, granular forms of educational data include, but are not limited to: timing data (how long did an item or each part of an item take a learner), items' user-interface event logs, and dialogue transcripts or event logs from intelligent tutoring systems.

Educational data, e.g., xAPI data, can be stored in learning record stores. Educational data can be processed and analyzed per educational data mining techniques.

According to Wikipedia, applications of educational data mining include: (1) the analysis and visualization of data, (2) providing feedback for supporting instructors, (3) recommendations for students, (4) predicting student performance, (5) student modeling, (6) detecting undesirable student behaviors, (7) grouping students, (8) social network analysis, (9) developing concept maps, (10) constructing courseware, and (11) planning and scheduling.

With respect to points 3, 4, and 5, there are open learner modeling and analytics to consider. In these approaches, learners can access, view, and benefit from their learner models, using this information to better select and prioritize their practice activities.

On these topics, there are preschool, kindergarten, elementary school, middle school, secondary or high school, trade and vocational school, university, and recreational and lifelong learning scenarios to consider. The topics also span sectors. Beyond academia, there are also industry (e.g., business training), public sector (e.g., government personnel training), and military domains to consider.

Brainstorming on your point: could there be user permission prompts when learners first initialize their educational resources (e.g., websites, digital books, digital textbooks) and when they connect these to any remote services, including servers at their schools?

> As with the first question: what can't you do over fetch, web sockets, or WebRTC or whatever?

I have also thought about WebRTC on these topics.

The following recent video shows multimodal language models seeing displayed items and learners performing on these items while simultaneously engaging in dialogue and answering questions: https://www.youtube.com/watch?v=IvXZCocyU_M .

Based on that video (which shows two desktop windows), I am thinking about client-side interoperability, e.g., interprocess communication, between Web browsers displaying educational resources (e.g., websites, digital books, digital textbooks) and intelligent tutoring systems to enable new features and capabilities.

@AdamSobieski
Author

AdamSobieski commented May 21, 2024

> what does the Web Platform not give you (as a primitive) to meet your requirements?

One thing that I'm hoping to discuss is enabling content authors and developers to be able to provide data and metadata for items within nested frames in Web browsers to external connected applications, e.g., intelligent tutoring systems, on clients.

Here are some more thoughts with respect to bidirectional interprocess and interapplication communication between Web browsers and other software applications.

Using the following variables:

var item_description = 'http://www.example.com/2024/#item-description';
var item_instructions = 'http://www.example.com/2024/#item-learner-instructions';
var item_objectives = 'http://www.example.com/2024/#item-educational-objectives';
var item_hints = 'http://www.example.com/2024/#item-hints';

and with something like:

window.exportData(item_description, 'text/plain', 'en', 'data:text/plain;base64,SGVsbG8sIFdvcmxkIQ==');
window.exportData(item_instructions, 'text/plain', 'en', 'data:text/plain;base64,SGVsbG8sIFdvcmxkIQ==');
window.exportData(item_objectives, 'text/plain', 'en', 'data:text/plain;base64,SGVsbG8sIFdvcmxkIQ==');
window.exportData(item_hints, 'text/json', 'en', 'data:text/json;base64,SGVsbG8sIFdvcmxkIQ==');

and/or:

window.exportData(item_description, 'text/plain', 'en', my_js_callback_1);
window.exportData(item_instructions, 'text/plain', 'en', my_js_callback_2);
window.exportData(item_objectives, 'text/plain', 'en', my_js_callback_3);
window.exportData(item_hints, 'text/json', 'en', my_js_callback_4);

external processes, e.g., intelligent tutoring systems, would be able to connect and detect available exported data and functions (per semantic identifiers and other content-negotiation data) and could choose to retrieve or invoke these.

From the perspective of external software applications, implementation particulars for obtaining exported data from Web browsers' tabs would depend upon the operating system. There would also be a matter of enabling external software applications to detect changes in exported data or functions of interest to them, e.g., when items were completed by learners and new items were presented to them.

With respect to ensuring that multiple exported data could be synchronized, e.g., that all of the available exported data refer to the same item, something like the following could be considered:

window.exportOpen();
window.exportData(item_description, 'text/plain', 'en', 'data:text/plain;base64,SGVsbG8sIFdvcmxkIQ==');
window.exportData(item_instructions, 'text/plain', 'en', 'data:text/plain;base64,SGVsbG8sIFdvcmxkIQ==');
window.exportData(item_objectives, 'text/plain', 'en', 'data:text/plain;base64,SGVsbG8sIFdvcmxkIQ==');
window.exportData(item_hints, 'text/json', 'en', 'data:text/json;base64,SGVsbG8sIFdvcmxkIQ==');
window.exportClose();

and/or:

window.exportOpen();
window.exportData(item_description, 'text/plain', 'en', my_js_callback_1);
window.exportData(item_instructions, 'text/plain', 'en', my_js_callback_2);
window.exportData(item_objectives, 'text/plain', 'en', my_js_callback_3);
window.exportData(item_hints, 'text/json', 'en', my_js_callback_4);
window.exportClose();

Below, the sketches are refactored to show possibilities:

window.interprocess.open();
window.interprocess.setExport(item_description, 'text/plain', 'en', 'data:text/plain;base64,SGVsbG8sIFdvcmxkIQ==');
window.interprocess.setExport(item_instructions, 'text/plain', 'en', 'data:text/plain;base64,SGVsbG8sIFdvcmxkIQ==');
window.interprocess.setExport(item_objectives, 'text/plain', 'en', 'data:text/plain;base64,SGVsbG8sIFdvcmxkIQ==');
window.interprocess.setExport(item_hints, 'text/json', 'en', 'data:text/json;base64,SGVsbG8sIFdvcmxkIQ==');
window.interprocess.close();

and/or:

window.interprocess.open();
window.interprocess.setExport(item_description, 'text/plain', 'en', my_js_callback_1);
window.interprocess.setExport(item_instructions, 'text/plain', 'en', my_js_callback_2);
window.interprocess.setExport(item_objectives, 'text/plain', 'en', my_js_callback_3);
window.interprocess.setExport(item_hints, 'text/json', 'en', my_js_callback_4);
window.interprocess.close();
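
To illustrate the intended open/close semantics without any browser support, here is an in-memory model in which exports staged between `open()` and `close()` become visible together. The `ExportRegistry` class and its `snapshot()` method are hypothetical, and replacing the whole published set on `close()` is one possible design choice among several.

```javascript
// In-memory model of the batching idea: exports set between open() and
// close() become visible to readers together, never partially.
class ExportRegistry {
  constructor() {
    this.published = new Map(); // What external readers currently see.
    this.pending = null; // Exports staged since the last open().
  }

  open() {
    this.pending = new Map();
  }

  setExport(id, type, lang, value) {
    if (this.pending === null) throw new Error('call open() before setExport()');
    this.pending.set(id, { type, lang, value });
  }

  close() {
    if (this.pending === null) throw new Error('call open() before close()');
    this.published = this.pending; // Swap in the whole batch at once.
    this.pending = null;
  }

  // What a connected application would observe at this moment.
  snapshot() {
    return new Map(this.published);
  }
}
```

With this model, a connected application never sees the description of one item alongside the hints of another, because each batch replaces the previous one as a unit.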

There are a variety of related technologies (e.g., fetch, cross-document messaging, web sockets, WebRTC, etc.) and interprocess and interapplication communication topics have been explored previously (e.g., Web Intents).

I have some additional ideas about JS API with respect to the other direction of communication, how external processes or applications might provide data properties and functions to scripts running in Web browsers.

Any thoughts on a JS API for unidirectional or bidirectional interprocess and interapplication communication?

@AdamSobieski
Author

AdamSobieski commented May 27, 2024

A different approach for enabling interprocess and interapplication communication is presented below, one more directly inspired by the Web Platform.

In an operating-system-dependent and -mediated manner, other processes (e.g., intelligent tutoring systems) could connect to Web browser processes and provide them with interfaces with which to enable message passing.

With respect to browser-side JavaScript, something like the following would enable Web developers to obtain a process that is connected to the Web browser and then a window object within that process.

var w = window.interprocess.getProcess(/* ? */).getWindow(/* ? */);

w.postMessage(...);

As shown in the following example, some sort of app:// URL scheme could be of use for representing the origins of connected processes.

window.interprocess.addEventListener(
    "message",
    (event) => {
        if (event.origin !== "app://vendor/application/major.minor.build.revision/instanceNumber") return;
        // ...
    },
    false,
);
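
Rather than comparing the full origin string, a page might want to accept any instance or compatible version of a known application. The sketch below parses the illustrative `app://vendor/application/major.minor.build.revision/instanceNumber` format; since that scheme is only a suggestion here, the parser and the trust rule are equally hypothetical.

```javascript
// Parse the illustrative app:// origin format:
// app://vendor/application/major.minor.build.revision/instanceNumber
const APP_ORIGIN = /^app:\/\/([^/]+)\/([^/]+)\/(\d+)\.(\d+)\.(\d+)\.(\d+)\/(\d+)$/;

function parseAppOrigin(origin) {
  const match = APP_ORIGIN.exec(origin);
  if (match === null) return null;
  const [, vendor, application, major, minor, build, revision, instance] = match;
  return {
    vendor,
    application,
    version: [major, minor, build, revision].map(Number),
    instance: Number(instance),
  };
}

// Accept any instance of a given vendor's application at or above a major version.
function isTrustedProcess(origin, { vendor, application, minMajor }) {
  const parsed = parseAppOrigin(origin);
  return (
    parsed !== null &&
    parsed.vendor === vendor &&
    parsed.application === application &&
    parsed.version[0] >= minMajor
  );
}
```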

The following example shows how one could utilize a new Process interface to access connected processes' metadata. Connected processes' metadata could be provided by operating systems, using applications' digitally-signed manifests, or by other means.

window.interprocess.addEventListener(
    "message",
    (event) => {
        var process = window.interprocess.getProcess(event.origin);
        if(typeof process !== "undefined" && process !== null) {
            if(process.about...) {
                // ...
            }
        }
    },
    false,
);

Any thoughts on these more Web-Platform-inspired ideas for bidirectional interprocess and interapplication communication?

@marcoscaceres
Contributor

Right, browsers would never allow that because the web app could directly attack the process.

I'd encourage you to take a look at how Web Authn, Payment Request API, or the new Digital Credentials API work... those show specialized, secure, and privacy-preserving approaches to talking to native applications while minimizing the risk of attacks from message passing.

@AdamSobieski
Author

Ok, I will take a look at how Web Authn, Payment Request API, and the new Digital Credentials API work, with a view to enabling specialized, secure, and privacy-preserving bidirectional communication between intelligent tutoring systems and Web browsers presenting learners with educational exercises and activities.

In these regards, thus far considered:

  1. Web browsers could relay information about homework items to intelligent tutoring systems, e.g., items' descriptions, instructions, educational objectives, and hints.
  2. Intelligent tutoring systems could select and highlight elements within homework items' accompanying multimedia resources.
    1. In a dialogue about a physics homework item, for example, as an intelligent tutoring system said "gravitational force", a downward-pointing arrow in that item's illustration could be selected and visually highlighted, and, as the intelligent tutoring system said "normal force", an upward-pointing arrow in that illustration could be selected and visually highlighted.
    2. Virtual cameras could be moved through 3D-graphics visualizations to showcase specific elements in them, including those selected and visually highlighted elements, and animations could be navigated to frames depicting elements.
  3. Applications could stream data, e.g., xAPI data, to any local or remote educational services, e.g., LRS's, pre-configured and selected by end-users.

@AdamSobieski
Author

If I understand correctly, you are indicating that mutual authentication should be required for inter-application communication and that, from the browser-side, interfaces to cryptographic objects should be reused from other, existing APIs (e.g., Web Authn, Payment Request API, or Digital Credentials API).

That is, examples illustrating mTLS (e.g., for Node.js) involve providing file-system paths to important resources (key, cert, ca) and, if I understand correctly, you are indicating that interfaces from other APIs can and should be utilized instead.

Brainstorming:

  1. An end-user indicates for an intelligent tutoring system to connect to the active tab of a Web browser window.
  2. The intelligent tutoring system initiates a secure inter-application connection process, providing a certificate.
  3. The Web browser verifies this certificate.
  4. The Web browser requests the end-user's permission, displaying to them information from this verified certificate.
  5. If the end-user agrees, the JavaScript software in the Web browser tab then receives an event that it may recognize and handle, including by signaling its readiness to connect to the connection-requesting software application.
  6. If the connection is accepted by the JavaScript software in the Web browser tab, the JavaScript software must provide a certificate and may request one from the end-user.
  7. If a certificate is requested from the end-user, the Web browser or the operating system presents them with a window to select one from a certificate store (e.g., a school-issued one).
  8. The intelligent tutoring system verifies the certificate provided to it.
  9. The intelligent tutoring system and Web browser mutually authenticate and communicate securely.
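
The brainstormed steps can be read as a small state machine. The sketch below encodes them that way so that invalid orderings are rejected; the state and event names are invented for illustration, and steps 6 and 7 (the page providing a certificate and optionally requesting one from the end-user) are collapsed into a single transition.

```javascript
// States of the brainstormed connection flow and the events that move between them.
const TRANSITIONS = {
  idle: { connectRequested: 'verifyingAppCertificate' },
  verifyingAppCertificate: {
    certificateValid: 'awaitingUserPermission',
    certificateInvalid: 'rejected',
  },
  awaitingUserPermission: { userAgreed: 'awaitingPageDecision', userDeclined: 'rejected' },
  awaitingPageDecision: { pageAccepted: 'verifyingPageCertificate', pageDeclined: 'rejected' },
  verifyingPageCertificate: { certificateValid: 'connected', certificateInvalid: 'rejected' },
  connected: {},
  rejected: {},
};

// Advance one step, refusing events that are not valid in the current state.
function step(state, event) {
  const next = TRANSITIONS[state] && TRANSITIONS[state][event];
  if (next === undefined) {
    throw new Error(`event "${event}" is not valid in state "${state}"`);
  }
  return next;
}

// Replay a list of events from the initial state and return the final state.
function run(events) {
  return events.reduce((state, event) => step(state, event), 'idle');
}
```

Replaying `connectRequested`, `certificateValid`, `userAgreed`, `pageAccepted`, `certificateValid` ends in `connected`; any failed verification or declined prompt ends in `rejected`.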

@marcoscaceres
Contributor

marcoscaceres commented Jun 6, 2024

Right, but now change “intelligent tutoring system” to “application” to generalize the use case.

It sounds a little bit like #151 might have some overlap here. At the same time, enabling direct cross-process/cross-application communication is not something either native or web apps really do (for security reasons). If you want to share data to an application, Web Share is a good option.

What, specifically, would the tutoring agent do? Like, complete this sequence:

  • payment wallets provide payment instruments/facilitate a one time payment.
  • credential wallet provides a one time digital credential (e.g. driver’s license)
  • web share shares a one time piece of data (URL, text, files)
  • Tutoring agent ??? a one time ???

@AdamSobieski
Author

AdamSobieski commented Jun 7, 2024

That is a good point about generalizing to the broader use case of secure cross-process/cross-application communication. Yes, I also see some overlap with #151.

1. Web Browser ⟶ Tutoring Agent

Web browsers should be able to share data about homework items with tutoring agents. Homework items' data could be used to populate portions of LLMs' prompts or to add to multimodal dialogue contexts.

In theory, relaying these data to tutoring agents could be accomplished using Web Share. Homework items might be displayed in a frame, however, and "transient activation" would, as I understand it, be needed per homework item.

That is, using Web Share would mean that end-users would have to click on a Share with Tutor button and then select the tutor application from a set of options each time that they wanted to send homework items' data to their tutoring agent.
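
For reference, a Web Share based version of the relay might look like the sketch below. `navigator.share()` is the existing Web Share API; the item shape and function names are hypothetical. The call has to originate from a user gesture, which is the per-item friction described above, and a cross-origin frame would additionally need to be granted the `web-share` permission policy by its embedder.

```javascript
// Build the data passed to navigator.share() for one homework item.
function buildShareData(item) {
  return {
    title: item.title,
    text: `${item.instructions}\n\nObjectives: ${item.objectives.join('; ')}`,
    url: item.url,
  };
}

// Must be called from a user gesture (e.g., a click handler), because
// navigator.share() requires transient activation.
async function shareWithTutor(item) {
  if (typeof navigator === 'undefined' || typeof navigator.share !== 'function') {
    return false; // Web Share is not available in this environment.
  }
  await navigator.share(buildShareData(item));
  return true;
}
```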

2. Tutoring Agent ⟶ Web Browser

In theory, homework items' data sent to tutoring agents could include sets of described "landmarks" in items' accompanying multimedia resources. Capable tutoring agents could make use of these described "landmarks" to select, show, and highlight content in items' multimedia resources.

For example, a mathematics homework item might have an accompanying illustration depicting a right triangle with three sides and three angles. A tutoring agent would be able to select, show, and highlight any of these six described "landmarks" in the illustration.

For example, a physics homework item might have an accompanying illustration depicting two penguins, a rope, and a pulley. A tutoring agent would, similarly, be able to select, show, and highlight any of these four described "landmarks" in the illustration.

As considered, beyond selecting, showing, and highlighting elements in 2D pictures, these approaches should also work with stateful multimedia, animations, and 3D graphics visualizations.

3. What Would the Tutoring Agent Do?

To your question, a tutoring agent would receive data about homework items to help end-users and would be able to invoke items' interface functions, e.g., to engage in multimodal communication by intelligently selecting, showing, and highlighting described "landmarks" in items' accompanying multimedia resources.

@marcoscaceres
Contributor

Sorry, please bear with me as this is new to me.

Are there examples of these tutoring agents in the wild? (e.g., browser extension) or couldn't the agent just run directly on the website? (e.g., via something like WebNN)

If it's just a software component, then the site could just directly interface with the model through JS.

What am I missing?

@AdamSobieski
Author

AdamSobieski commented Jun 14, 2024

According to Wikipedia, examples of intelligent tutoring systems in the wild include: Algebra Tutor, SQL-Tutor, EER-Tutor, COLLECT-UML, StoichTutor, Mathematics Tutor, eTeacher, ZOSMAT, REALP, CIRCSIM-Tutor, Why2-Atlas, SmartTutor, AutoTutor, ActiveMath, ESC101-ITS, AdaptErrEx, GIFT, SHERLOCK, Cardiac Tutor, and CODES.

There are also tutoring-related browser extensions (e.g., here).

The recent Khanmigo video shows another example of the state of the art. In this video, one can see a learner and tutor agent discussing a right triangle. Generalizing, tutor agents could receive information about homework items (beyond "seeing" them) and they could also refer to, point to, and visually highlight, content in items' accompanying multimedia resources.

Also, brainstorming on these "tutor-item interoperation" topics, beyond visually selecting and highlighting "landmarks" in items' accompanying multimedia resources, AI tutor agents could populate items' text input fields with learner-specified content on their behalf. That is, agents, e.g., tutors, could serve as natural-language user interfaces to Web-based content, e.g., homework items.

For example, a learner might tell a tutor agent that an item's right triangle's hypotenuse was "5", and the tutor agent could enter that value into the appropriate text input field of that item. In theory, this would occur via the invocation of another natural-language-described function on the item's interface. Alternatively, tutor agents could verbally respond to learners that they were correct and indicate for them to then enter values into the appropriate text input field of the item.

To your question, a tutor agent and educational content, e.g., homework items, could be provided in one browser tab. However, these software components (e.g., digital textbooks, tutoring agents) might be from separate vendors. One could also put these software components into two or more interoperating browser tabs (e.g., in separate browser windows). Based on that video, however, I thought about interprocess / interapplication communication scenarios.

@AdamSobieski
Author

AdamSobieski commented Sep 4, 2024

@marcoscaceres, on these topics of exploring approaches to enabling secure communications between websites and AI assistants, in this case intelligent tutors, thinking about some of the points that you raised here, I added some ideas to a new issue (#168) in its "Protocols" subsection.

The idea is that messages exchanged between websites and AI assistants could be semantic graphs instead of text strings or byte arrays. Then, using ontologies and shape constraints (SHACL), developers could define message classes against which message instances could be validated.

In addition to message classes, protocol definitions could specify rules (including time-based), valid sequences of message classes, valid state transitions, and so forth.

var channel = window.assistant.openChannelForProtocol('http://example.org/2024/protocol-123/#');

if(channel != null)
{
  channel.onmessage = (event) => {
    switch(event.class)
    {
      case 'http://example.org/2024/protocol-123/#messageClass1':
        messageHandler1(event.graph);
        break;
      case 'http://example.org/2024/protocol-123/#messageClass2':
        messageHandler2(event.graph);
        break;
      // ...
    }
  };

  channel.postMessage(/* ... */);
}

Also, in addition to the communicating software (e.g., websites, AI assistants) validating received messages against the defined communication protocols, communication-mediating software (e.g., Web browsers, operating systems) could be configured to enforce those protocol definitions, both for non-encrypted communications and for encrypted communications to which it is a party. That is, communication channel objects could ensure conformance with the communication protocols provided to them when they were created or initialized.
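
As a sketch of how a channel object might check conformance, a protocol's valid message sequences can be written as a finite-state machine over message classes. The protocol shape, the class names, and `createConformanceChecker` are assumptions made for illustration; graph-level validation (e.g., with SHACL) would sit alongside this and is not shown.

```javascript
// A protocol definition as a finite-state machine over message classes.
// The class names are illustrative.
const exampleProtocol = {
  initial: 'start',
  transitions: {
    start: { '#describeItem': 'described' },
    described: { '#highlightLandmark': 'described', '#itemCompleted': 'start' },
  },
};

function createConformanceChecker(protocol) {
  let state = protocol.initial;
  return {
    // Returns true and advances if the message class is valid in the current
    // state; returns false and leaves the state unchanged otherwise.
    accept(messageClass) {
      const next = (protocol.transitions[state] || {})[messageClass];
      if (next === undefined) return false;
      state = next;
      return true;
    },
    // The checker's current protocol state.
    get state() {
      return state;
    },
  };
}
```

A mediating channel would call `accept()` for each message class it sees and drop, or report, any message for which it returns `false`.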

Showcasing the history of these topics, here are some relevant Wikipedia articles:

Research on process calculi began in earnest with Robin Milner's seminal work on the Calculus of Communicating Systems (CCS) during the period from 1973 to 1980. C.A.R. Hoare's Communicating Sequential Processes (CSP) first appeared in 1978, and was subsequently developed into a full-fledged process calculus during the early 1980s. There was much cross-fertilization of ideas between CCS and CSP as they developed. In 1982 Jan Bergstra and Jan Willem Klop began work on what came to be known as the Algebra of Communicating Processes (ACP), and introduced the term process algebra to describe their work. CCS, CSP, and ACP constitute the three major branches of the process calculi family: the majority of the other process calculi can trace their roots to one of these three calculi.

Today, gRPC is an example of a popular open-source (interface description) language and framework for generating software for remote procedure calls, a type of request-response communication protocol. From its documentation, here is an example of a service definition:

// The greeter service definition.
service Greeter {
  // Sends a greeting
  rpc SayHello (HelloRequest) returns (HelloReply) {}
}

// The request message containing the user's name.
message HelloRequest {
  string name = 1;
}

// The response message containing the greetings
message HelloReply {
  string message = 1;
}

I am finding these communication protocol, process calculus, and actor model topics to be interesting, e.g., in the context of multi-agent systems (e.g., the AutoGen framework).


2 participants