# Telemetry & Observability for Volar Language Servers

> [Docs Index](README.md) • [Repo README](../README.md) • [Performance Guide](performance-and-debugging.md) • [Error Handling](error-handling-and-resilience.md)

Robust observability helps teams detect regressions, diagnose user issues, and justify performance work. This guide covers the main observability tools available to Volar-based servers: telemetry events, logging, progress reporting, health probes, and integration tips for popular editors.

## Observability Building Blocks

| Tool | Purpose | APIs |
| --- | --- | --- |
| Console logging | Developer-facing output in editor logs | `connection.console.{info,warn,error}` |
| Telemetry events | Structured analytics consumed by clients | `connection.telemetry.logEvent` |
| Work done progress | User-facing progress bars for long tasks | `connection.window.createWorkDoneProgress` |
| Diagnostics refresh | Force client to re-query diagnostics | `connection.languages.diagnostics.refresh()` |
| Custom notifications | Surface warnings/errors to users | `connection.window.showWarningMessage` |

## Logging Strategies

### 1. Namespaced Logs

Prefix logs with your server/component so users can filter easily:

```ts
// `connection` is the server's LSP connection; `json-yaml` is an example namespace.
const log = (level: 'info' | 'warn' | 'error', message: string, payload?: unknown) => {
  // Compare against undefined so falsy payloads such as 0 or false are still logged.
  const suffix = payload !== undefined ? ` ${JSON.stringify(payload)}` : '';
  connection.console[level](`[json-yaml:${level}] ${message}${suffix}`);
};
```

- Use `info` for high-level events (project load, schema cache hits).
- Use `warn` for recoverable issues (fallback to default schema).
- Use `error` for critical failures, ideally accompanied by a user notification.

### 2. Log Levels via Settings

Expose a `logLevel` configuration (`off | error | warn | info | debug`) so users can control verbosity:

```ts
// Emit verbose entries only when the user has opted into debug logging.
// The `log` helper accepts info/warn/error, so debug output is routed through `info`.
if (config.logLevel === 'debug') {
  log('info', 'Schema fetch start', { uri });
}
```

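Checking for `'debug'` at every call site gets repetitive once several levels are in play. One way to centralize the decision is a small ranking table, sketched below. The names `LogLevel`, `LEVEL_RANK`, and `shouldLog` are hypothetical and not part of Volar or the LSP libraries.

```typescript
// Hypothetical names for illustration; not a Volar API.
type LogLevel = 'off' | 'error' | 'warn' | 'info' | 'debug';

// Higher rank means more verbose; 'off' ranks below every message level.
const LEVEL_RANK: Record<LogLevel, number> = { off: 0, error: 1, warn: 2, info: 3, debug: 4 };

// True when a message at `messageLevel` should be emitted under the `configured` level.
function shouldLog(configured: LogLevel, messageLevel: Exclude<LogLevel, 'off'>): boolean {
  return LEVEL_RANK[messageLevel] <= LEVEL_RANK[configured];
}
```

A logging helper could call `shouldLog(config.logLevel, level)` before formatting the message, so that disabled levels cost almost nothing.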
### 3. Structured Payloads

Include relevant context in JSON to aid parsing:

```json
{
  "timestamp": "...",
  "event": "schema.fetch",
  "uri": "http://schemas/foo.json",
  "durationMs": 123,
  "success": true
}
```

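A record in that shape could be produced by a small helper along these lines. This is a sketch only: `toLogLine` and `LogRecord` are hypothetical names, and the injectable `now` parameter exists purely to make the helper easy to test.

```typescript
// Hypothetical helper that builds one JSON log line with a timestamp and event name.
interface LogRecord {
  timestamp: string;
  event: string;
  [key: string]: unknown;
}

function toLogLine(event: string, fields: Record<string, unknown>, now: Date = new Date()): string {
  // Spread `fields` first so callers cannot overwrite the reserved keys.
  const record: LogRecord = { ...fields, timestamp: now.toISOString(), event };
  return JSON.stringify(record);
}
```

Keeping the reserved keys (`timestamp`, `event`) under the helper's control means downstream parsers can rely on them regardless of what individual call sites pass in.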
## Telemetry Events

Telemetry is optional and client-controlled. LSP defines no standard client capability for `telemetry/event`, so gate events on an explicit opt-in. The example assumes the client advertises a custom `experimental.telemetry` flag:

```ts
// `experimental.telemetry` is a custom opt-in convention, not part of the LSP specification.
const telemetrySupported = server.initializeParams.capabilities?.experimental?.telemetry === true;

if (telemetrySupported) {
  connection.telemetry.logEvent({
    type: 'json-yaml.schemaFetch',
    uri,
    durationMs,
    success,
  });
}
```

### Event Design Principles

1. **No PII** – never include actual source code or user-specific paths unless hashed/anonymized.
2. **Actionable** – log events that can drive product decisions (schema fetch failures, TypeScript reloads, “workspace diagnostics took > 5s”).
3. **Stable schema** – define event names and payload shapes up front; changing them frequently breaks dashboards.

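For the first principle, one common way to keep a per-file grouping key without exposing the path is to send a salted digest instead of the URI. The sketch below assumes a Node.js runtime; `anonymizeUri` and the salt handling are hypothetical, and whether hashing satisfies your privacy requirements is a policy decision this example does not settle.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical helper: replaces a URI with a short, salted SHA-256 digest so events
// can still be grouped per file without carrying the path itself.
function anonymizeUri(uri: string, salt: string): string {
  return createHash('sha256').update(salt).update(uri).digest('hex').slice(0, 16);
}
```

The same URI and salt always yield the same token, so counts per file remain meaningful, while a different salt per installation prevents tokens from being compared across users.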
### Recommended Events

| Event | Trigger | Payload |
| --- | --- | --- |
| `schema.fetch` | Schema download (success + failure) | `{ uri, durationMs, success }` |
| `diagnostics.publish` | After `sendDiagnostics` | `{ uri, count, durationMs }` |
| `workspaceDiagnostics.run` | Workspace diagnostics completed | `{ documentCount, durationMs }` |
| `configuration.apply` | New config applied | `{ success, changedKeys }` |
| `takeOverMode.warning` | Detected conflicting TS server | `{ message }` |

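To keep these event names and payload shapes stable, they could be encoded as types so the compiler flags accidental drift. The sketch below is one possible encoding: the field types are assumptions inferred from the payload column, and `EventPayloads` and `makeEvent` are hypothetical names.

```typescript
// Hypothetical typing of the recommended events; field types are assumptions.
interface EventPayloads {
  'schema.fetch': { uri: string; durationMs: number; success: boolean };
  'diagnostics.publish': { uri: string; count: number; durationMs: number };
  'workspaceDiagnostics.run': { documentCount: number; durationMs: number };
  'configuration.apply': { success: boolean; changedKeys: string[] };
  'takeOverMode.warning': { message: string };
}

// Builds an event object; unknown names or mismatched payloads fail to compile.
function makeEvent<K extends keyof EventPayloads>(type: K, payload: EventPayloads[K]) {
  return { type, ...payload };
}
```

The result of `makeEvent('schema.fetch', { uri, durationMs, success })` can be passed straight to a telemetry sink, and renaming a field in `EventPayloads` surfaces every call site that needs updating.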
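### Sample Telemetry Wiring (VS Code Extension)

```ts
// Assumes `client` is the extension's LanguageClient and TelemetryReporter comes from
// @vscode/extension-telemetry (newer releases take a single connection string instead).
const telemetryReporter = new TelemetryReporter('volar-extension', version, aiKey);

// Events the server sends via connection.telemetry.logEvent arrive here on the client.
client.onTelemetry((event: AnyEvent) => {
  telemetryReporter.sendTelemetryEvent(event.type, sanitize(event));
});

function sanitize({ uri, ...rest }: AnyEvent): Record<string, string> {
  // `uri` is dropped to avoid sending raw file paths; the reporter expects string properties.
  const properties: Record<string, string> = {};
  for (const [key, value] of Object.entries(rest)) {
    properties[key] = String(value);
  }
  properties.timestamp = new Date().toISOString();
  return properties;
}
```

- Use Azure Application Insights, Segment, or any analytics platform that matches your privacy requirements.
- Strip PII (file paths, code snippets) before forwarding events.
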
### Dashboard Example

Track key metrics with a dashboard (e.g., Grafana/Looker):

| Widget | Description |
| --- | --- |
| Schema Fetch Success % | `success` count / total by day; alert if < 95%. |
| Diagnostics Duration P95 | 95th percentile of `diagnostics.publish` `durationMs`, plotted over time. |
| Workspace Diagnostics Runs | Number per workspace; spikes may indicate user confusion. |
| Top Error Messages | Grouped count of `takeOverMode.warning` and other errors. |

Use these dashboards to catch regressions (e.g., schema outages, slow diagnostics) before users report them.

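The two numeric widgets reduce to simple aggregations over collected events. Dashboards normally compute these in their own query language; the sketch below only illustrates the arithmetic, using the nearest-rank definition of a percentile. The function names are hypothetical.

```typescript
// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error('percentile requires at least one sample');
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// Fraction of events with success === true; an empty window counts as fully successful
// so that quiet days do not trigger the "< 95%" alert.
function successRate(events: { success: boolean }[]): number {
  if (events.length === 0) return 1;
  return events.filter((e) => e.success).length / events.length;
}
```

For example, `successRate(dayEvents) < 0.95` corresponds to the alert condition in the first widget, and `percentile(durations, 95)` to the value plotted in the second.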
## Work Done Progress & Notifications

For long-running operations (initial project load, large workspace diagnostics), display progress so users know work is happening.

```ts
async function withProgress(title: string, task: () => Promise<void>) {
  // Ask the client to create a server-initiated progress indicator.
  const progress = await connection.window.createWorkDoneProgress();
  progress.begin(title, 0);
  try {
    await task();
    progress.report(100, 'Complete');
  } finally {
    // Always end the progress so the indicator does not linger if the task throws.
    progress.done();
  }
}
```

Also use `connection.window.showWarningMessage` / `showErrorMessage` for issues requiring user intervention (missing schema files, invalid config).

## Health Checks & Metrics

Consider exposing internal health metrics for CI or headless environments:

```ts
// `volar/health` is a custom request; `metrics` is an application-defined object.
connection.onRequest('volar/health', async () => ({
  openDocuments: server.documents.all().length,
  workspaceFolders: server.workspaceFolders.all.length,
  lastDiagnosticsMs: metrics.lastDiagnosticsDuration,
}));
```

Integrate this request into smoke tests or CI to ensure the server responds with sane values.

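What counts as "sane" is project-specific. As a starting point, a smoke test could validate the response with a check like the one sketched here. `HealthReport` mirrors the fields returned by the handler; `checkHealth` and the 5000 ms default threshold are hypothetical.

```typescript
// Mirrors the fields returned by the `volar/health` handler.
interface HealthReport {
  openDocuments: number;
  workspaceFolders: number;
  lastDiagnosticsMs: number;
}

// Returns a list of problems; an empty list means the report looks healthy.
// The 5000 ms default is an arbitrary example threshold.
function checkHealth(report: HealthReport, maxDiagnosticsMs = 5000): string[] {
  const problems: string[] = [];
  const fields: (keyof HealthReport)[] = ['openDocuments', 'workspaceFolders', 'lastDiagnosticsMs'];
  for (const field of fields) {
    const value = report[field];
    if (!Number.isFinite(value) || value < 0) {
      problems.push(`${field} must be a non-negative number, got ${value}`);
    }
  }
  if (report.lastDiagnosticsMs > maxDiagnosticsMs) {
    problems.push(`lastDiagnosticsMs exceeds ${maxDiagnosticsMs} ms`);
  }
  return problems;
}
```

A CI job would send the request, pass the result to `checkHealth`, and fail the build if the returned list is non-empty.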
## Editor-Specific Considerations

### VS Code

- Use `connection.window.showInformationMessage` sparingly; prefer progress notifications and logs.
- Provide commands that surface diagnostics/profiling info for debugging (e.g., “Volar: Show Server Logs”).

### Neovim

- Expose an RPC command to toggle verbose logging at runtime.
- Append logs to a file (e.g., `~/.cache/volar-nvim/volar.log`) so users can share them in bug reports.

### CLI

- Provide `--log-level` and `--log-file` flags.
- For headless usage, print JSON logs to stdout so downstream automation can parse them.

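For the CLI case, flag handling can stay dependency-free. The sketch below parses the two flags from an argument array such as `process.argv.slice(2)`. `parseLogFlags`, `LogOptions`, and the `'info'` default are assumptions made for illustration.

```typescript
interface LogOptions {
  logLevel: string;
  logFile?: string;
}

// Hypothetical parser for `--log-level` and `--log-file`; other arguments are ignored.
function parseLogFlags(argv: string[]): LogOptions {
  const options: LogOptions = { logLevel: 'info' }; // assumed default
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    const eq = arg.indexOf('=');
    const flag = eq === -1 ? arg : arg.slice(0, eq);
    if (flag !== '--log-level' && flag !== '--log-file') continue;
    // Accept both `--flag=value` and `--flag value`.
    const value = eq === -1 ? argv[++i] : arg.slice(eq + 1);
    if (value === undefined) throw new Error(`${flag} requires a value`);
    if (flag === '--log-level') options.logLevel = value;
    else options.logFile = value;
  }
  return options;
}
```

Splitting on the first `=` only keeps values that themselves contain `=` intact, which matters for file paths.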
## Sampling & Rate Limiting

- For frequent events (per-keystroke completions), sample logs/telemetry to reduce noise:

```ts
// Forward roughly 10% of completion events.
if (Math.random() < 0.1) {
  connection.telemetry.logEvent({ type: 'completion.run', durationMs });
}
```

- Throttle repeated warnings (e.g., schema fetch failures) to once per URI per minute to avoid spamming logs.

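The per-URI throttle can be sketched as a closure over a map of last-emission times. `createThrottle` is a hypothetical helper, and the injectable clock is there only so the behavior can be tested without waiting.

```typescript
// Hypothetical helper: returns a function that allows one emission per key per interval.
function createThrottle(intervalMs: number, now: () => number = Date.now) {
  const lastEmitted = new Map<string, number>();
  return (key: string): boolean => {
    const current = now();
    const previous = lastEmitted.get(key);
    if (previous !== undefined && current - previous < intervalMs) {
      return false; // still inside the quiet period for this key
    }
    lastEmitted.set(key, current);
    return true;
  };
}
```

With `const allowWarning = createThrottle(60_000)`, a call site would log only when `allowWarning(uri)` returns `true`. The map grows with the number of distinct keys, so a long-running server may want to prune old entries.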
## Error Reporting

When an unexpected exception occurs:

1. Log the stack via `connection.console.error`.
2. Emit telemetry (if enabled) with a sanitized version of the error.
3. Notify the user when action is needed (“Failed to load schema; see Output ▸ Volar for details”).

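For step 2, "sanitized" usually means removing file paths from the message and stack before they leave the machine. The sketch below takes a deliberately aggressive, best-effort approach. `sanitizeError` and both regular expressions are assumptions, and they will also scrub URLs and other slash-separated text.

```typescript
// Hypothetical helper: strips path-like substrings from an error before it is sent as telemetry.
function sanitizeError(error: unknown): { name: string; message: string; stack?: string } {
  const err = error instanceof Error ? error : new Error(String(error));
  const scrub = (text: string): string =>
    text
      .replace(/file:\/\/[^\s)'"]+/g, '<path>') // file URIs
      .replace(/(?:[A-Za-z]:)?[\\/][^\s:)'"]+/g, '<path>'); // POSIX and Windows style paths
  return {
    name: err.name,
    message: scrub(err.message),
    stack: err.stack ? scrub(err.stack) : undefined,
  };
}
```

The full, unsanitized stack still belongs in the local log from step 1, where only the user can see it.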
## Observability Checklist

1. **Logging** – Namespaced helper with log levels; logs include key metadata.
2. **Telemetry** – Optional, PII-free events for actions that matter (schema fetch, diagnostics).
3. **Progress** – Work done progress notifications for long-running operations.
4. **Notifications** – Friendly user messages for actionable issues.
5. **Health endpoint** – Optional `volar/health` request for automated monitors.
6. **Configurable verbosity** – Users can toggle log level / telemetry participation.
7. **Sampling** – Applied to high-frequency events to avoid flooding logs.
8. **Documentation** – Tell users how to capture logs/telemetry, attach them to bug reports, and opt out if desired.

Instrument early and consistently: observability is far easier to add when your server is small than when you’re firefighting production issues.