# Bundling issues with tiktoken (Error: Missing tiktoken_bg.wasm) #1127
I am encountering this issue when trying to integrate llamaindex into my Obsidian plugin. The build output for the plugin is a bundle.

package.json (the relevant part):

```json
{
  "type": "module",
  "scripts": {
    "dev": "node esbuild.config.mjs"
  },
  "dependencies": {
    "llamaindex": "0.5.20"
  }
}
```

esbuild.config.mjs:

```javascript
import esbuild from "esbuild";
import process from "node:process";
import builtins from "builtin-modules";

const context = await esbuild.context({
  entryPoints: { main: "src/main.ts" },
  bundle: true,
  platform: "node",
  external: [
    "obsidian",
    "electron",
    "sharp",
    "onnxruntime-node",
    "./xhr-sync-worker.js",
    ...builtins,
  ],
  mainFields: ["browser", "module", "main"],
  conditions: ["browser"],
  format: "cjs",
  target: "es2022",
  logLevel: "info",
  treeShaking: true,
  outdir: ".",
});

await context.rebuild();
process.exit(0);
```

tsconfig.json:

```json
{
  "compilerOptions": {
    "baseUrl": "./src",
    "target": "es2022",
    "module": "ESNext",
    "moduleResolution": "bundler",
    "esModuleInterop": true,
    "skipLibCheck": true,
    "types": ["node", "jest"],
    "lib": ["DOM", "ES5", "ES6", "ES7", "ES2021", "ES2022"]
  },
  "include": ["**/*.ts"]
}
```

If I now use the following in my main.ts:

```typescript
import { HuggingFaceEmbedding, Settings } from "llamaindex";

Settings.embedModel = new HuggingFaceEmbedding({
  modelType: "nomic-ai/nomic-embed-text-v1.5",
  quantized: false,
});
```

I get the `Missing tiktoken_bg.wasm` error.
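A possible workaround, offered only as an untested sketch for this setup: mark `tiktoken` as external so esbuild does not inline it. The package then loads `tiktoken_bg.wasm` relative to its own directory in `node_modules` at runtime, which means the package must still be present next to the plugin when it runs.

```javascript
// esbuild.config.mjs (sketch, untested assumption): add tiktoken to the
// externals so its __dirname-relative wasm lookup keeps working.
external: [
  "obsidian",
  "electron",
  "sharp",
  "onnxruntime-node",
  "tiktoken", // added: keep the wasm-loading module out of the bundle
  "./xhr-sync-worker.js",
  ...builtins,
],
```

For a plugin that must ship as a single bundle, an alternative along the same lines would be a post-build step that copies `tiktoken_bg.wasm` next to the bundled output.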
---

Just in case someone else faces the same issue, this is how I solved it. My `next.config.mjs`:

```javascript
import path from "path";
import { fileURLToPath } from "url";
import _jiti from "jiti";
import { withLlamaIndex } from "@web/chatbot/next";

const jiti = _jiti(fileURLToPath(import.meta.url));

// Import env files to validate at build time. Use jiti so we can load .ts files in here.
jiti("./src/env");

const isStaticExport = "false";

// Get __dirname equivalent for ES modules
const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);

/**
 * @type {import("next").NextConfig}
 */
const nextConfig = {
  basePath: process.env.NEXT_PUBLIC_BASE_PATH,
  serverRuntimeConfig: {
    PROJECT_ROOT: __dirname,
  },
  env: {
    BUILD_STATIC_EXPORT: isStaticExport,
  },
  // Trailing slashes must be disabled for the Next Auth callback endpoint to work
  // https://stackoverflow.com/a/78348528
  trailingSlash: false,
  modularizeImports: {
    "@mui/icons-material": {
      transform: "@mui/icons-material/{{member}}",
    },
    "@mui/material": {
      transform: "@mui/material/{{member}}",
    },
    "@mui/lab": {
      transform: "@mui/lab/{{member}}",
    },
  },
  webpack(config) {
    config.module.rules.push({
      test: /\.svg$/,
      use: ["@svgr/webpack"],
    });
    // To allow the chatbot to work
    // Extracted from: https://github.com/neondatabase/examples/blob/main/ai/llamaindex/rag-nextjs/next.config.mjs
    config.resolve.alias = {
      ...config.resolve.alias,
      sharp$: false,
      "onnxruntime-node$": false,
    };
    // From: https://github.com/dqbd/tiktoken?tab=readme-ov-file#nextjs
    config.experiments = {
      asyncWebAssembly: true,
      layers: true,
    };
    return config;
  },
  ...(isStaticExport === "true" && {
    output: "export",
  }),
  experimental: {
    outputFileTracingIncludes: {
      "/*": ["./cache/**/*"],
      "/api/**/*": ["./node_modules/**/*.wasm"],
    },
    serverComponentsExternalPackages: ["tiktoken", "onnxruntime-node"],
  },
  /** Enables hot reloading for local packages without a build step */
  transpilePackages: [
    "@web/api",
    "@web/auth",
    "@web/db",
    "@web/ui",
    "@web/validators",
    "@web/services",
    "@web/utils",
    "@web/logger",
    "@web/certs",
    "@web/chatbot",
  ],
  /** We already do linting and typechecking as separate tasks in CI */
  eslint: { ignoreDuringBuilds: true },
  typescript: { ignoreBuildErrors: true },
};

const withLlamaIndexConfig = withLlamaIndex(nextConfig);

export default withLlamaIndexConfig;
```

In my case, everything related to llamaindex lives in the `@web/chatbot` package. Here's what my package.json at `@web/chatbot` looks like:

```json
{
  "name": "@web/chatbot",
  "private": true,
  "version": "0.1.0",
  "type": "module",
  "exports": {
    ".": "./src/index.ts",
    "./next": "./src/with-lama-index.mjs"
  },
  "license": "MIT",
  "scripts": {
    "clean": "rm -rf .turbo node_modules",
    "format": "prettier --check . --ignore-path ../../.gitignore --ignore-path ../../.prettierignore",
    "lint": "eslint .",
    "typecheck": "tsc --emitDeclarationOnly"
  },
  "devDependencies": {
    "@web/eslint-config": "workspace:*",
    "@web/prettier-config": "workspace:*",
    "@web/tsconfig": "workspace:*",
    "@web/utils": "workspace:*",
    "eslint": "catalog:",
    "prettier": "catalog:",
    "typescript": "catalog:"
  },
  "prettier": "@web/prettier-config",
  "dependencies": {
    "@web/logger": "workspace:*",
    "@t3-oss/env-nextjs": "catalog:",
    "js-tiktoken": "^1.0.14",
    "llamaindex": "catalog:",
    "pg": "^8.13.0",
    "tiktoken": "^1.0.16"
  }
}
```

For more context check #1226.
---

I think I'm a victim of this too! My Lambda logs show the same error.

I think this is the last workaround I needed, and things seem to be OK now. The most recent issue was on Lambda execution: I'd get the above error long before llamaindex was even used. I'm no bundling expert, but I believe an equivalent of the Next.js workaround, which seems to be working for me, is to install tiktoken rather than bundle it, via an esbuild option (I have no idea how much it slows things down):

```typescript
const settings: cdk.aws_lambda_nodejs.NodejsFunctionProps = {
  handler: "handler",
  runtime: props.serviceConfig.nodeRuntime,
  memorySize: 256,
  tracing: Tracing.ACTIVE,
  bundling: {
    logLevel: LogLevel.INFO,
    nodeModules: ["tiktoken"], // main workaround
    minify: false, // branchIsMain(), tried both
    tsconfig: "../backend/tsconfig.json",
    sourceMap: !branchIsMain(),
    metafile: !branchIsMain(),
    // TODO LlamaIndex hack - see https://github.com/evanw/esbuild/issues/1051
    // This might appear as the following error during synth:
    // No loader is configured for ".node" files: node_modules/.pnpm/onnxruntime-node
    loader: {
      // Needed for another llamaindex dep issue
      ".node": "file",
      // Had this briefly, but now that tiktoken is installed it's no longer needed
      // ".wasm": "file",
    },
  },
};
```

Would love some wisdom on this workaround, and whether there are plans to fix this. I'd also love some insight into whether I really need to install some of these extra libs in my package.json; I went ahead and did that, but it often had no effect (I'll assume because they were all tree-shaken out). Thanks for tracking this!

Update: the workaround likely won't work after all, at least for now; it hits the maximum artifact size. Any other suggestions would be greatly appreciated!
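If installing the whole package via `nodeModules: ["tiktoken"]` blows past the Lambda artifact size limit, a narrower variant (an untested assumption on my part) might be to keep bundling the JS but ship only the wasm file, using the `commandHooks` option of `NodejsFunction` bundling. This relies on tiktoken reading `tiktoken_bg.wasm` relative to `__dirname`, which after bundling would be the output directory:

```typescript
// Sketch (untested assumption): copy just tiktoken_bg.wasm into the Lambda
// asset instead of installing the entire tiktoken package.
bundling: {
  commandHooks: {
    beforeBundling: (inputDir: string, outputDir: string): string[] => [],
    beforeInstall: (inputDir: string, outputDir: string): string[] => [],
    afterBundling: (inputDir: string, outputDir: string): string[] => [
      `cp ${inputDir}/node_modules/tiktoken/tiktoken_bg.wasm ${outputDir}/`,
    ],
  },
  loader: { ".node": "file" },
},
```

The copied wasm is a few megabytes, far smaller than the full `node_modules` install, so it should stay well under the artifact limit.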
---

I am opening this ticket to gather all issues related to bundling the WASM from https://github.com/dqbd/tiktoken.

If you encounter this issue, please post your setup and configuration here.