
Langfuse Python SDK
Installation
The SDK was rewritten in v3 and released in June 2025. Refer to the v3 migration guide for instructions on updating your code.
```
pip install langfuse
```
Docs
Please see our docs for detailed information on this SDK.
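For orientation before the API reference below, here is a minimal quickstart sketch. It condenses the `Langfuse` class docstring's own example; the credentials are placeholders, and the client also reads `LANGFUSE_PUBLIC_KEY`, `LANGFUSE_SECRET_KEY`, and `LANGFUSE_BASE_URL` from the environment.

```python
from langfuse import Langfuse

# Placeholder credentials; omit them to fall back to LANGFUSE_* env vars
langfuse = Langfuse(public_key="pk-lf-...", secret_key="sk-lf-...")

with langfuse.start_as_current_span(name="handle-request") as span:
    # Nested generation span for an LLM call
    with span.start_as_current_generation(name="llm-call", model="gpt-4") as generation:
        generation.update(
            output="AI is a field of computer science...",
            usage_details={"prompt_tokens": 10, "completion_tokens": 50},
        )
```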
1""".. include:: ../README.md""" 2 3from langfuse.experiment import Evaluation 4 5from ._client import client as _client_module 6from ._client.attributes import LangfuseOtelSpanAttributes 7from ._client.constants import ObservationTypeLiteral 8from ._client.get_client import get_client 9from ._client.observe import observe 10from ._client.propagation import propagate_attributes 11from ._client.span import ( 12 LangfuseAgent, 13 LangfuseChain, 14 LangfuseEmbedding, 15 LangfuseEvaluator, 16 LangfuseEvent, 17 LangfuseGeneration, 18 LangfuseGuardrail, 19 LangfuseRetriever, 20 LangfuseSpan, 21 LangfuseTool, 22) 23 24Langfuse = _client_module.Langfuse 25 26__all__ = [ 27 "Langfuse", 28 "get_client", 29 "observe", 30 "propagate_attributes", 31 "ObservationTypeLiteral", 32 "LangfuseSpan", 33 "LangfuseGeneration", 34 "LangfuseEvent", 35 "LangfuseOtelSpanAttributes", 36 "LangfuseAgent", 37 "LangfuseTool", 38 "LangfuseChain", 39 "LangfuseEmbedding", 40 "LangfuseEvaluator", 41 "LangfuseRetriever", 42 "LangfuseGuardrail", 43 "Evaluation", 44 "experiment", 45 "api", 46]
The `Langfuse` client class, its configuration options, and constructor:

````python
class Langfuse:
    """Main client for Langfuse tracing and platform features.

    This class provides an interface for creating and managing traces, spans,
    and generations in Langfuse as well as interacting with the Langfuse API.

    The client features a thread-safe singleton pattern for each unique public API key,
    ensuring consistent trace context propagation across your application. It implements
    efficient batching of spans with configurable flush settings and includes background
    thread management for media uploads and score ingestion.

    Configuration is flexible through either direct parameters or environment variables,
    with graceful fallbacks and runtime configuration updates.

    Attributes:
        api: Synchronous API client for Langfuse backend communication
        async_api: Asynchronous API client for Langfuse backend communication
        _otel_tracer: Internal LangfuseTracer instance managing OpenTelemetry components

    Parameters:
        public_key (Optional[str]): Your Langfuse public API key. Can also be set via LANGFUSE_PUBLIC_KEY environment variable.
        secret_key (Optional[str]): Your Langfuse secret API key. Can also be set via LANGFUSE_SECRET_KEY environment variable.
        base_url (Optional[str]): The Langfuse API base URL. Defaults to "https://cloud.langfuse.com". Can also be set via LANGFUSE_BASE_URL environment variable.
        host (Optional[str]): Deprecated. Use base_url instead. The Langfuse API host URL. Defaults to "https://cloud.langfuse.com".
        timeout (Optional[int]): Timeout in seconds for API requests. Defaults to 5 seconds.
        httpx_client (Optional[httpx.Client]): Custom httpx client for making non-tracing HTTP requests. If not provided, a default client will be created.
        debug (bool): Enable debug logging. Defaults to False. Can also be set via LANGFUSE_DEBUG environment variable.
        tracing_enabled (Optional[bool]): Enable or disable tracing. Defaults to True. Can also be set via LANGFUSE_TRACING_ENABLED environment variable.
        flush_at (Optional[int]): Number of spans to batch before sending to the API. Defaults to 512. Can also be set via LANGFUSE_FLUSH_AT environment variable.
        flush_interval (Optional[float]): Time in seconds between batch flushes. Defaults to 5 seconds. Can also be set via LANGFUSE_FLUSH_INTERVAL environment variable.
        environment (Optional[str]): Environment name for tracing. Defaults to 'default'. Can also be set via LANGFUSE_TRACING_ENVIRONMENT environment variable. Can be any lowercase alphanumeric string with hyphens and underscores that does not start with 'langfuse'.
        release (Optional[str]): Release version/hash of your application. Used for grouping analytics by release.
        media_upload_thread_count (Optional[int]): Number of background threads for handling media uploads. Defaults to 1. Can also be set via LANGFUSE_MEDIA_UPLOAD_THREAD_COUNT environment variable.
        sample_rate (Optional[float]): Sampling rate for traces (0.0 to 1.0). Defaults to 1.0 (100% of traces are sampled). Can also be set via LANGFUSE_SAMPLE_RATE environment variable.
        mask (Optional[MaskFunction]): Function to mask sensitive data in traces before sending to the API.
        blocked_instrumentation_scopes (Optional[List[str]]): List of instrumentation scope names to block from being exported to Langfuse. Spans from these scopes will be filtered out before being sent to the API. Useful for filtering out spans from specific libraries or frameworks. For exported spans, you can see the instrumentation scope name in the span metadata in Langfuse (`metadata.scope.name`).
        additional_headers (Optional[Dict[str, str]]): Additional headers to include in all API requests and OTLPSpanExporter requests. These headers will be merged with default headers. Note: If httpx_client is provided, additional_headers must be set directly on your custom httpx_client as well.
        tracer_provider (Optional[TracerProvider]): OpenTelemetry TracerProvider to use for Langfuse. Useful for keeping Langfuse tracing disconnected from other OpenTelemetry-span-emitting libraries. Note: To track active spans, the context is still shared between TracerProviders. This may lead to broken trace trees.

    Example:
        ```python
        from langfuse import Langfuse

        # Initialize the client (reads from env vars if not provided)
        langfuse = Langfuse(
            public_key="your-public-key",
            secret_key="your-secret-key",
            base_url="https://cloud.langfuse.com",  # Optional, default shown
        )

        # Create a trace span
        with langfuse.start_as_current_span(name="process-query") as span:
            # Your application code here

            # Create a nested generation span for an LLM call
            with span.start_as_current_generation(
                name="generate-response",
                model="gpt-4",
                input={"query": "Tell me about AI"},
                model_parameters={"temperature": 0.7, "max_tokens": 500}
            ) as generation:
                # Generate response here
                response = "AI is a field of computer science..."

                generation.update(
                    output=response,
                    usage_details={"prompt_tokens": 10, "completion_tokens": 50},
                    cost_details={"total_cost": 0.0023}
                )

                # Score the generation (supports NUMERIC, BOOLEAN, CATEGORICAL)
                generation.score(name="relevance", value=0.95, data_type="NUMERIC")
        ```
    """

    _resources: Optional[LangfuseResourceManager] = None
    _mask: Optional[MaskFunction] = None
    _otel_tracer: otel_trace_api.Tracer

    def __init__(
        self,
        *,
        public_key: Optional[str] = None,
        secret_key: Optional[str] = None,
        base_url: Optional[str] = None,
        host: Optional[str] = None,
        timeout: Optional[int] = None,
        httpx_client: Optional[httpx.Client] = None,
        debug: bool = False,
        tracing_enabled: Optional[bool] = True,
        flush_at: Optional[int] = None,
        flush_interval: Optional[float] = None,
        environment: Optional[str] = None,
        release: Optional[str] = None,
        media_upload_thread_count: Optional[int] = None,
        sample_rate: Optional[float] = None,
        mask: Optional[MaskFunction] = None,
        blocked_instrumentation_scopes: Optional[List[str]] = None,
        additional_headers: Optional[Dict[str, str]] = None,
        tracer_provider: Optional[TracerProvider] = None,
    ):
        self._base_url = (
            base_url
            or os.environ.get(LANGFUSE_BASE_URL)
            or host
            or os.environ.get(LANGFUSE_HOST, "https://cloud.langfuse.com")
        )
        self._environment = environment or cast(
            str, os.environ.get(LANGFUSE_TRACING_ENVIRONMENT)
        )
        self._project_id: Optional[str] = None
        sample_rate = sample_rate or float(os.environ.get(LANGFUSE_SAMPLE_RATE, 1.0))
        if not 0.0 <= sample_rate <= 1.0:
            raise ValueError(
                f"Sample rate must be between 0.0 and 1.0, got {sample_rate}"
            )

        timeout = timeout or int(os.environ.get(LANGFUSE_TIMEOUT, 5))

        self._tracing_enabled = (
            tracing_enabled
            and os.environ.get(LANGFUSE_TRACING_ENABLED, "true").lower() != "false"
        )
        if not self._tracing_enabled:
            langfuse_logger.info(
                "Configuration: Langfuse tracing is explicitly disabled. No data will be sent to the Langfuse API."
            )

        debug = (
            debug if debug else (os.getenv(LANGFUSE_DEBUG, "false").lower() == "true")
        )
        if debug:
            logging.basicConfig(
                format="%(asctime)s - %(name)s - %(levelname)s - %(message)s"
            )
            langfuse_logger.setLevel(logging.DEBUG)

        public_key = public_key or os.environ.get(LANGFUSE_PUBLIC_KEY)
        if public_key is None:
            langfuse_logger.warning(
                "Authentication error: Langfuse client initialized without public_key. Client will be disabled. "
                "Provide a public_key parameter or set LANGFUSE_PUBLIC_KEY environment variable. "
            )
            self._otel_tracer = otel_trace_api.NoOpTracer()
            return

        secret_key = secret_key or os.environ.get(LANGFUSE_SECRET_KEY)
        if secret_key is None:
            langfuse_logger.warning(
                "Authentication error: Langfuse client initialized without secret_key. Client will be disabled. "
                "Provide a secret_key parameter or set LANGFUSE_SECRET_KEY environment variable. "
            )
            self._otel_tracer = otel_trace_api.NoOpTracer()
            return

        if os.environ.get("OTEL_SDK_DISABLED", "false").lower() == "true":
            langfuse_logger.warning(
                "OTEL_SDK_DISABLED is set. Langfuse tracing will be disabled and no traces will appear in the UI."
            )

        # Initialize api and tracer if requirements are met
        self._resources = LangfuseResourceManager(
            public_key=public_key,
            secret_key=secret_key,
            base_url=self._base_url,
            timeout=timeout,
            environment=self._environment,
            release=release,
            flush_at=flush_at,
            flush_interval=flush_interval,
            httpx_client=httpx_client,
            media_upload_thread_count=media_upload_thread_count,
            sample_rate=sample_rate,
            mask=mask,
            tracing_enabled=self._tracing_enabled,
            blocked_instrumentation_scopes=blocked_instrumentation_scopes,
            additional_headers=additional_headers,
            tracer_provider=tracer_provider,
        )
        self._mask = self._resources.mask

        self._otel_tracer = (
            self._resources.tracer
            if self._tracing_enabled and self._resources.tracer is not None
            else otel_trace_api.NoOpTracer()
        )
        self.api = self._resources.api
        self.async_api = self._resources.async_api
````
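A short configuration sketch that follows the constructor logic above: every parameter falls back to a `LANGFUSE_*` environment variable, and a missing key disables the client (it swaps in a `NoOpTracer` and logs a warning rather than raising). The specific values are placeholders.

```python
import os

from langfuse import Langfuse

# Explicit configuration
langfuse = Langfuse(
    public_key="pk-lf-...",
    secret_key="sk-lf-...",
    base_url="https://cloud.langfuse.com",
    sample_rate=0.25,  # trace 25% of requests
    flush_at=128,      # send batches of 128 spans
)

# Or rely entirely on environment variables
os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-lf-..."
os.environ["LANGFUSE_SECRET_KEY"] = "sk-lf-..."
langfuse = Langfuse()  # reads LANGFUSE_* vars; warns and no-ops if keys are missing
```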
Span and observation creation:

````python
    def start_span(
        self,
        *,
        trace_context: Optional[TraceContext] = None,
        name: str,
        input: Optional[Any] = None,
        output: Optional[Any] = None,
        metadata: Optional[Any] = None,
        version: Optional[str] = None,
        level: Optional[SpanLevel] = None,
        status_message: Optional[str] = None,
    ) -> LangfuseSpan:
        """Create a new span for tracing a unit of work.

        This method creates a new span but does not set it as the current span in the
        context. To create and use a span within a context, use start_as_current_span().

        The created span will be the child of the current span in the context.

        Args:
            trace_context: Optional context for connecting to an existing trace
            name: Name of the span (e.g., function or operation name)
            input: Input data for the operation (can be any JSON-serializable object)
            output: Output data from the operation (can be any JSON-serializable object)
            metadata: Additional metadata to associate with the span
            version: Version identifier for the code or component
            level: Importance level of the span (info, warning, error)
            status_message: Optional status message for the span

        Returns:
            A LangfuseSpan object that must be ended with .end() when the operation completes

        Example:
            ```python
            span = langfuse.start_span(name="process-data")
            try:
                # Do work
                span.update(output="result")
            finally:
                span.end()
            ```
        """
        return self.start_observation(
            trace_context=trace_context,
            name=name,
            as_type="span",
            input=input,
            output=output,
            metadata=metadata,
            version=version,
            level=level,
            status_message=status_message,
        )

    def start_as_current_span(
        self,
        *,
        trace_context: Optional[TraceContext] = None,
        name: str,
        input: Optional[Any] = None,
        output: Optional[Any] = None,
        metadata: Optional[Any] = None,
        version: Optional[str] = None,
        level: Optional[SpanLevel] = None,
        status_message: Optional[str] = None,
        end_on_exit: Optional[bool] = None,
    ) -> _AgnosticContextManager[LangfuseSpan]:
        """Create a new span and set it as the current span in a context manager.

        This method creates a new span and sets it as the current span within a context
        manager. Use this method with a 'with' statement to automatically handle span
        lifecycle within a code block.

        The created span will be the child of the current span in the context.

        Args:
            trace_context: Optional context for connecting to an existing trace
            name: Name of the span (e.g., function or operation name)
            input: Input data for the operation (can be any JSON-serializable object)
            output: Output data from the operation (can be any JSON-serializable object)
            metadata: Additional metadata to associate with the span
            version: Version identifier for the code or component
            level: Importance level of the span (info, warning, error)
            status_message: Optional status message for the span
            end_on_exit (default: True): Whether to end the span automatically when leaving the context manager. If False, the span must be manually ended to avoid memory leaks.

        Returns:
            A context manager that yields a LangfuseSpan

        Example:
            ```python
            with langfuse.start_as_current_span(name="process-query") as span:
                # Do work
                result = process_data()
                span.update(output=result)

                # Create a child span automatically
                with span.start_as_current_span(name="sub-operation") as child_span:
                    # Do sub-operation work
                    child_span.update(output="sub-result")
            ```
        """
        return self.start_as_current_observation(
            trace_context=trace_context,
            name=name,
            as_type="span",
            input=input,
            output=output,
            metadata=metadata,
            version=version,
            level=level,
            status_message=status_message,
            end_on_exit=end_on_exit,
        )

    @overload
    def start_observation(
        self,
        *,
        trace_context: Optional[TraceContext] = None,
        name: str,
        as_type: Literal["generation"],
        input: Optional[Any] = None,
        output: Optional[Any] = None,
        metadata: Optional[Any] = None,
        version: Optional[str] = None,
        level: Optional[SpanLevel] = None,
        status_message: Optional[str] = None,
        completion_start_time: Optional[datetime] = None,
        model: Optional[str] = None,
        model_parameters: Optional[Dict[str, MapValue]] = None,
        usage_details: Optional[Dict[str, int]] = None,
        cost_details: Optional[Dict[str, float]] = None,
        prompt: Optional[PromptClient] = None,
    ) -> LangfuseGeneration: ...

    @overload
    def start_observation(
        self,
        *,
        trace_context: Optional[TraceContext] = None,
        name: str,
        as_type: Literal["span"] = "span",
        input: Optional[Any] = None,
        output: Optional[Any] = None,
        metadata: Optional[Any] = None,
        version: Optional[str] = None,
        level: Optional[SpanLevel] = None,
        status_message: Optional[str] = None,
    ) -> LangfuseSpan: ...

    @overload
    def start_observation(
        self,
        *,
        trace_context: Optional[TraceContext] = None,
        name: str,
        as_type: Literal["agent"],
        input: Optional[Any] = None,
        output: Optional[Any] = None,
        metadata: Optional[Any] = None,
        version: Optional[str] = None,
        level: Optional[SpanLevel] = None,
        status_message: Optional[str] = None,
    ) -> LangfuseAgent: ...

    @overload
    def start_observation(
        self,
        *,
        trace_context: Optional[TraceContext] = None,
        name: str,
        as_type: Literal["tool"],
        input: Optional[Any] = None,
        output: Optional[Any] = None,
        metadata: Optional[Any] = None,
        version: Optional[str] = None,
        level: Optional[SpanLevel] = None,
        status_message: Optional[str] = None,
    ) -> LangfuseTool: ...

    @overload
    def start_observation(
        self,
        *,
        trace_context: Optional[TraceContext] = None,
        name: str,
        as_type: Literal["chain"],
        input: Optional[Any] = None,
        output: Optional[Any] = None,
        metadata: Optional[Any] = None,
        version: Optional[str] = None,
        level: Optional[SpanLevel] = None,
        status_message: Optional[str] = None,
    ) -> LangfuseChain: ...

    @overload
    def start_observation(
        self,
        *,
        trace_context: Optional[TraceContext] = None,
        name: str,
        as_type: Literal["retriever"],
        input: Optional[Any] = None,
        output: Optional[Any] = None,
        metadata: Optional[Any] = None,
        version: Optional[str] = None,
        level: Optional[SpanLevel] = None,
        status_message: Optional[str] = None,
    ) -> LangfuseRetriever: ...

    @overload
    def start_observation(
        self,
        *,
        trace_context: Optional[TraceContext] = None,
        name: str,
        as_type: Literal["evaluator"],
        input: Optional[Any] = None,
        output: Optional[Any] = None,
        metadata: Optional[Any] = None,
        version: Optional[str] = None,
        level: Optional[SpanLevel] = None,
        status_message: Optional[str] = None,
    ) -> LangfuseEvaluator: ...

    @overload
    def start_observation(
        self,
        *,
        trace_context: Optional[TraceContext] = None,
        name: str,
        as_type: Literal["embedding"],
        input: Optional[Any] = None,
        output: Optional[Any] = None,
        metadata: Optional[Any] = None,
        version: Optional[str] = None,
        level: Optional[SpanLevel] = None,
        status_message: Optional[str] = None,
        completion_start_time: Optional[datetime] = None,
        model: Optional[str] = None,
        model_parameters: Optional[Dict[str, MapValue]] = None,
        usage_details: Optional[Dict[str, int]] = None,
        cost_details: Optional[Dict[str, float]] = None,
        prompt: Optional[PromptClient] = None,
    ) -> LangfuseEmbedding: ...

    @overload
    def start_observation(
        self,
        *,
        trace_context: Optional[TraceContext] = None,
        name: str,
        as_type: Literal["guardrail"],
        input: Optional[Any] = None,
        output: Optional[Any] = None,
        metadata: Optional[Any] = None,
        version: Optional[str] = None,
        level: Optional[SpanLevel] = None,
        status_message: Optional[str] = None,
    ) -> LangfuseGuardrail: ...

    def start_observation(
        self,
        *,
        trace_context: Optional[TraceContext] = None,
        name: str,
        as_type: ObservationTypeLiteralNoEvent = "span",
        input: Optional[Any] = None,
        output: Optional[Any] = None,
        metadata: Optional[Any] = None,
        version: Optional[str] = None,
        level: Optional[SpanLevel] = None,
        status_message: Optional[str] = None,
        completion_start_time: Optional[datetime] = None,
        model: Optional[str] = None,
        model_parameters: Optional[Dict[str, MapValue]] = None,
        usage_details: Optional[Dict[str, int]] = None,
        cost_details: Optional[Dict[str, float]] = None,
        prompt: Optional[PromptClient] = None,
    ) -> Union[
        LangfuseSpan,
        LangfuseGeneration,
        LangfuseAgent,
        LangfuseTool,
        LangfuseChain,
        LangfuseRetriever,
        LangfuseEvaluator,
        LangfuseEmbedding,
        LangfuseGuardrail,
    ]:
        """Create a new observation of the specified type.

        This method creates a new observation but does not set it as the current span in the
        context. To create and use an observation within a context, use start_as_current_observation().

        Args:
            trace_context: Optional context for connecting to an existing trace
            name: Name of the observation
            as_type: Type of observation to create (defaults to "span")
            input: Input data for the operation
            output: Output data from the operation
            metadata: Additional metadata to associate with the observation
            version: Version identifier for the code or component
            level: Importance level of the observation
            status_message: Optional status message for the observation
            completion_start_time: When the model started generating (for generation types)
            model: Name/identifier of the AI model used (for generation types)
            model_parameters: Parameters used for the model (for generation types)
            usage_details: Token usage information (for generation types)
            cost_details: Cost information (for generation types)
            prompt: Associated prompt template (for generation types)

        Returns:
            An observation object of the appropriate type that must be ended with .end()
        """
        if trace_context:
            trace_id = trace_context.get("trace_id", None)
            parent_span_id = trace_context.get("parent_span_id", None)

            if trace_id:
                remote_parent_span = self._create_remote_parent_span(
                    trace_id=trace_id, parent_span_id=parent_span_id
                )

                with otel_trace_api.use_span(
                    cast(otel_trace_api.Span, remote_parent_span)
                ):
                    otel_span = self._otel_tracer.start_span(name=name)
                    otel_span.set_attribute(LangfuseOtelSpanAttributes.AS_ROOT, True)

                    return self._create_observation_from_otel_span(
                        otel_span=otel_span,
                        as_type=as_type,
                        input=input,
                        output=output,
                        metadata=metadata,
                        version=version,
                        level=level,
                        status_message=status_message,
                        completion_start_time=completion_start_time,
                        model=model,
                        model_parameters=model_parameters,
                        usage_details=usage_details,
                        cost_details=cost_details,
                        prompt=prompt,
                    )

        otel_span = self._otel_tracer.start_span(name=name)

        return self._create_observation_from_otel_span(
            otel_span=otel_span,
            as_type=as_type,
            input=input,
            output=output,
            metadata=metadata,
            version=version,
            level=level,
            status_message=status_message,
            completion_start_time=completion_start_time,
            model=model,
            model_parameters=model_parameters,
            usage_details=usage_details,
            cost_details=cost_details,
            prompt=prompt,
        )
````
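`start_observation` carries no inline example, so here is a hedged usage sketch of manually managed, typed observations. The `as_type` values come from the overloads above; the `update()`/`end()` calls mirror the `start_span` docstring example, and applying them to retriever/embedding observations is an assumption that the typed span classes share that interface.

```python
langfuse = Langfuse()

# A typed observation that is NOT set as the current context span
retrieval = langfuse.start_observation(
    name="vector-lookup",
    as_type="retriever",
    input={"query": "quantum computing"},
)
try:
    docs = ["doc-1", "doc-2"]  # placeholder for real retrieval results
    retrieval.update(output=docs)
finally:
    retrieval.end()  # must be ended manually, unlike the context-manager variants

# Generation-like types ("generation", "embedding") also accept model fields
embedding = langfuse.start_observation(
    name="embed-query",
    as_type="embedding",
    model="text-embedding-3-small",  # placeholder model name
)
embedding.end()
```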
Observation construction internals, the deprecated generation helpers, and the context-manager variant:

````python
    def _create_observation_from_otel_span(
        self,
        *,
        otel_span: otel_trace_api.Span,
        as_type: ObservationTypeLiteralNoEvent,
        input: Optional[Any] = None,
        output: Optional[Any] = None,
        metadata: Optional[Any] = None,
        version: Optional[str] = None,
        level: Optional[SpanLevel] = None,
        status_message: Optional[str] = None,
        completion_start_time: Optional[datetime] = None,
        model: Optional[str] = None,
        model_parameters: Optional[Dict[str, MapValue]] = None,
        usage_details: Optional[Dict[str, int]] = None,
        cost_details: Optional[Dict[str, float]] = None,
        prompt: Optional[PromptClient] = None,
    ) -> Union[
        LangfuseSpan,
        LangfuseGeneration,
        LangfuseAgent,
        LangfuseTool,
        LangfuseChain,
        LangfuseRetriever,
        LangfuseEvaluator,
        LangfuseEmbedding,
        LangfuseGuardrail,
    ]:
        """Create the appropriate observation type from an OTEL span."""
        if as_type in get_observation_types_list(ObservationTypeGenerationLike):
            observation_class = self._get_span_class(as_type)
            # Type ignore to prevent overloads of internal _get_span_class function;
            # issue is that LangfuseEvent could be returned and the classes have different args
            return observation_class(  # type: ignore[return-value,call-arg]
                otel_span=otel_span,
                langfuse_client=self,
                environment=self._environment,
                input=input,
                output=output,
                metadata=metadata,
                version=version,
                level=level,
                status_message=status_message,
                completion_start_time=completion_start_time,
                model=model,
                model_parameters=model_parameters,
                usage_details=usage_details,
                cost_details=cost_details,
                prompt=prompt,
            )
        else:
            # For other types (e.g. span, guardrail), create the appropriate class
            # without generation properties
            observation_class = self._get_span_class(as_type)
            return observation_class(  # type: ignore[return-value,call-arg]
                otel_span=otel_span,
                langfuse_client=self,
                environment=self._environment,
                input=input,
                output=output,
                metadata=metadata,
                version=version,
                level=level,
                status_message=status_message,
            )

    def start_generation(
        self,
        *,
        trace_context: Optional[TraceContext] = None,
        name: str,
        input: Optional[Any] = None,
        output: Optional[Any] = None,
        metadata: Optional[Any] = None,
        version: Optional[str] = None,
        level: Optional[SpanLevel] = None,
        status_message: Optional[str] = None,
        completion_start_time: Optional[datetime] = None,
        model: Optional[str] = None,
        model_parameters: Optional[Dict[str, MapValue]] = None,
        usage_details: Optional[Dict[str, int]] = None,
        cost_details: Optional[Dict[str, float]] = None,
        prompt: Optional[PromptClient] = None,
    ) -> LangfuseGeneration:
        """Create a new generation span for model generations.

        DEPRECATED: This method is deprecated and will be removed in a future version.
        Use start_observation(as_type='generation') instead.

        This method creates a specialized span for tracking model generations.
        It includes additional fields specific to model generations such as model name,
        token usage, and cost details.

        The created generation span will be the child of the current span in the context.

        Args:
            trace_context: Optional context for connecting to an existing trace
            name: Name of the generation operation
            input: Input data for the model (e.g., prompts)
            output: Output from the model (e.g., completions)
            metadata: Additional metadata to associate with the generation
            version: Version identifier for the model or component
            level: Importance level of the generation (info, warning, error)
            status_message: Optional status message for the generation
            completion_start_time: When the model started generating the response
            model: Name/identifier of the AI model used (e.g., "gpt-4")
            model_parameters: Parameters used for the model (e.g., temperature, max_tokens)
            usage_details: Token usage information (e.g., prompt_tokens, completion_tokens)
            cost_details: Cost information for the model call
            prompt: Associated prompt template from Langfuse prompt management

        Returns:
            A LangfuseGeneration object that must be ended with .end() when complete

        Example:
            ```python
            generation = langfuse.start_generation(
                name="answer-generation",
                model="gpt-4",
                input={"prompt": "Explain quantum computing"},
                model_parameters={"temperature": 0.7}
            )
            try:
                # Call model API
                response = llm.generate(...)

                generation.update(
                    output=response.text,
                    usage_details={
                        "prompt_tokens": response.usage.prompt_tokens,
                        "completion_tokens": response.usage.completion_tokens
                    }
                )
            finally:
                generation.end()
            ```
        """
        warnings.warn(
            "start_generation is deprecated and will be removed in a future version. "
            "Use start_observation(as_type='generation') instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.start_observation(
            trace_context=trace_context,
            name=name,
            as_type="generation",
            input=input,
            output=output,
            metadata=metadata,
            version=version,
            level=level,
            status_message=status_message,
            completion_start_time=completion_start_time,
            model=model,
            model_parameters=model_parameters,
            usage_details=usage_details,
            cost_details=cost_details,
            prompt=prompt,
        )

    def start_as_current_generation(
        self,
        *,
        trace_context: Optional[TraceContext] = None,
        name: str,
        input: Optional[Any] = None,
        output: Optional[Any] = None,
        metadata: Optional[Any] = None,
        version: Optional[str] = None,
        level: Optional[SpanLevel] = None,
        status_message: Optional[str] = None,
        completion_start_time: Optional[datetime] = None,
        model: Optional[str] = None,
        model_parameters: Optional[Dict[str, MapValue]] = None,
        usage_details: Optional[Dict[str, int]] = None,
        cost_details: Optional[Dict[str, float]] = None,
        prompt: Optional[PromptClient] = None,
        end_on_exit: Optional[bool] = None,
    ) -> _AgnosticContextManager[LangfuseGeneration]:
        """Create a new generation span and set it as the current span in a context manager.

        DEPRECATED: This method is deprecated and will be removed in a future version.
        Use start_as_current_observation(as_type='generation') instead.

        This method creates a specialized span for model generations and sets it as the
        current span within a context manager. Use this method with a 'with' statement to
        automatically handle the generation span lifecycle within a code block.

        The created generation span will be the child of the current span in the context.

        Args:
            trace_context: Optional context for connecting to an existing trace
            name: Name of the generation operation
            input: Input data for the model (e.g., prompts)
            output: Output from the model (e.g., completions)
            metadata: Additional metadata to associate with the generation
            version: Version identifier for the model or component
            level: Importance level of the generation (info, warning, error)
            status_message: Optional status message for the generation
            completion_start_time: When the model started generating the response
            model: Name/identifier of the AI model used (e.g., "gpt-4")
            model_parameters: Parameters used for the model (e.g., temperature, max_tokens)
            usage_details: Token usage information (e.g., prompt_tokens, completion_tokens)
            cost_details: Cost information for the model call
            prompt: Associated prompt template from Langfuse prompt management
            end_on_exit (default: True): Whether to end the span automatically when leaving the context manager. If False, the span must be manually ended to avoid memory leaks.

        Returns:
            A context manager that yields a LangfuseGeneration

        Example:
            ```python
            with langfuse.start_as_current_generation(
                name="answer-generation",
                model="gpt-4",
                input={"prompt": "Explain quantum computing"}
            ) as generation:
                # Call model API
                response = llm.generate(...)

                # Update with results
                generation.update(
                    output=response.text,
                    usage_details={
                        "prompt_tokens": response.usage.prompt_tokens,
                        "completion_tokens": response.usage.completion_tokens
                    }
                )
            ```
        """
        warnings.warn(
            "start_as_current_generation is deprecated and will be removed in a future version. "
            "Use start_as_current_observation(as_type='generation') instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.start_as_current_observation(
            trace_context=trace_context,
            name=name,
            as_type="generation",
            input=input,
            output=output,
            metadata=metadata,
            version=version,
            level=level,
            status_message=status_message,
            completion_start_time=completion_start_time,
            model=model,
            model_parameters=model_parameters,
            usage_details=usage_details,
            cost_details=cost_details,
            prompt=prompt,
            end_on_exit=end_on_exit,
        )

    @overload
    def start_as_current_observation(
        self,
        *,
        trace_context: Optional[TraceContext] = None,
        name: str,
        as_type: Literal["generation"],
        input: Optional[Any] = None,
        output: Optional[Any] = None,
        metadata: Optional[Any] = None,
        version: Optional[str] = None,
        level: Optional[SpanLevel] = None,
        status_message: Optional[str] = None,
        completion_start_time: Optional[datetime] = None,
        model: Optional[str] = None,
        model_parameters: Optional[Dict[str, MapValue]] = None,
        usage_details: Optional[Dict[str, int]] = None,
        cost_details: Optional[Dict[str, float]] = None,
        prompt: Optional[PromptClient] = None,
        end_on_exit: Optional[bool] = None,
    ) -> _AgnosticContextManager[LangfuseGeneration]: ...

    @overload
    def start_as_current_observation(
        self,
        *,
        trace_context: Optional[TraceContext] = None,
        name: str,
        as_type: Literal["span"] = "span",
        input: Optional[Any] = None,
        output: Optional[Any] = None,
        metadata: Optional[Any] = None,
        version: Optional[str] = None,
        level: Optional[SpanLevel] = None,
        status_message: Optional[str] = None,
        end_on_exit: Optional[bool] = None,
    ) -> _AgnosticContextManager[LangfuseSpan]: ...

    @overload
    def start_as_current_observation(
        self,
        *,
        trace_context: Optional[TraceContext] = None,
        name: str,
        as_type: Literal["agent"],
        input: Optional[Any] = None,
        output: Optional[Any] = None,
        metadata: Optional[Any] = None,
        version: Optional[str] = None,
        level: Optional[SpanLevel] = None,
        status_message: Optional[str] = None,
        end_on_exit: Optional[bool] = None,
    ) -> _AgnosticContextManager[LangfuseAgent]: ...

    @overload
    def start_as_current_observation(
        self,
        *,
        trace_context: Optional[TraceContext] = None,
        name: str,
        as_type: Literal["tool"],
        input: Optional[Any] = None,
        output: Optional[Any] = None,
        metadata: Optional[Any] = None,
        version: Optional[str] = None,
        level: Optional[SpanLevel] = None,
        status_message: Optional[str] = None,
        end_on_exit: Optional[bool] = None,
    ) -> _AgnosticContextManager[LangfuseTool]: ...

    @overload
    def start_as_current_observation(
        self,
        *,
        trace_context: Optional[TraceContext] = None,
        name: str,
        as_type: Literal["chain"],
        input: Optional[Any] = None,
        output: Optional[Any] = None,
        metadata: Optional[Any] = None,
        version: Optional[str] = None,
        level: Optional[SpanLevel] = None,
        status_message: Optional[str] = None,
        end_on_exit: Optional[bool] = None,
    ) -> _AgnosticContextManager[LangfuseChain]: ...

    @overload
    def start_as_current_observation(
        self,
        *,
        trace_context: Optional[TraceContext] = None,
        name: str,
        as_type: Literal["retriever"],
        input: Optional[Any] = None,
        output: Optional[Any] = None,
        metadata: Optional[Any] = None,
        version: Optional[str] = None,
        level: Optional[SpanLevel] = None,
        status_message: Optional[str] = None,
        end_on_exit: Optional[bool] = None,
    ) -> _AgnosticContextManager[LangfuseRetriever]: ...

    @overload
    def start_as_current_observation(
        self,
        *,
        trace_context: Optional[TraceContext] = None,
        name: str,
        as_type: Literal["evaluator"],
        input: Optional[Any] = None,
        output: Optional[Any] = None,
        metadata: Optional[Any] = None,
        version: Optional[str] = None,
        level: Optional[SpanLevel] = None,
        status_message: Optional[str] = None,
        end_on_exit: Optional[bool] = None,
    ) -> _AgnosticContextManager[LangfuseEvaluator]: ...

    @overload
    def start_as_current_observation(
        self,
        *,
        trace_context: Optional[TraceContext] = None,
        name: str,
        as_type: Literal["embedding"],
        input: Optional[Any] = None,
        output: Optional[Any] = None,
        metadata: Optional[Any] = None,
        version: Optional[str] = None,
        level: Optional[SpanLevel] = None,
        status_message: Optional[str] = None,
        completion_start_time: Optional[datetime] = None,
        model: Optional[str] = None,
        model_parameters: Optional[Dict[str, MapValue]] = None,
        usage_details: Optional[Dict[str, int]] = None,
        cost_details: Optional[Dict[str, float]] = None,
        prompt: Optional[PromptClient] = None,
        end_on_exit: Optional[bool] = None,
    ) -> _AgnosticContextManager[LangfuseEmbedding]: ...

    @overload
    def start_as_current_observation(
        self,
        *,
        trace_context: Optional[TraceContext] = None,
        name: str,
        as_type: Literal["guardrail"],
        input: Optional[Any] = None,
        output: Optional[Any] = None,
        metadata: Optional[Any] = None,
        version: Optional[str] = None,
        level: Optional[SpanLevel] = None,
        status_message: Optional[str] = None,
        end_on_exit: Optional[bool] = None,
    ) -> _AgnosticContextManager[LangfuseGuardrail]: ...

    def start_as_current_observation(
        self,
        *,
        trace_context: Optional[TraceContext] = None,
        name: str,
        as_type: ObservationTypeLiteralNoEvent = "span",
        input: Optional[Any] = None,
        output: Optional[Any] = None,
        metadata: Optional[Any] = None,
        version: Optional[str] = None,
        level: Optional[SpanLevel] = None,
        status_message: Optional[str] = None,
        completion_start_time: Optional[datetime] = None,
        model: Optional[str] = None,
        model_parameters: Optional[Dict[str, MapValue]] = None,
        usage_details: Optional[Dict[str, int]] = None,
        cost_details: Optional[Dict[str, float]] = None,
        prompt: Optional[PromptClient] = None,
        end_on_exit: Optional[bool] = None,
    ) -> Union[
        _AgnosticContextManager[LangfuseGeneration],
        _AgnosticContextManager[LangfuseSpan],
        _AgnosticContextManager[LangfuseAgent],
        _AgnosticContextManager[LangfuseTool],
        _AgnosticContextManager[LangfuseChain],
        _AgnosticContextManager[LangfuseRetriever],
        _AgnosticContextManager[LangfuseEvaluator],
        _AgnosticContextManager[LangfuseEmbedding],
        _AgnosticContextManager[LangfuseGuardrail],
    ]:
        """Create a new observation and set it as the current span in a context manager.

        This method creates a new observation of the specified type and sets it as the
        current span within a context manager. Use this method with a 'with' statement to
        automatically handle the observation lifecycle within a code block.

        The created observation will be the child of the current span in the context.

        Args:
            trace_context: Optional context for connecting to an existing trace
            name: Name of the observation (e.g., function or operation name)
            as_type: Type of observation to create (defaults to "span")
            input: Input data for the operation (can be any JSON-serializable object)
            output: Output data from the operation (can be any JSON-serializable object)
            metadata: Additional metadata to associate with the observation
            version: Version identifier for the code or component
            level: Importance level of the observation (info, warning, error)
            status_message: Optional status message for the observation
            end_on_exit (default: True): Whether to end the span automatically when leaving the context manager. If False, the span must be manually ended to avoid memory leaks.

            The following parameters are available when as_type is "generation" or "embedding":

            completion_start_time: When the model started generating the response
            model: Name/identifier of the AI model used (e.g., "gpt-4")
            model_parameters: Parameters used for the model (e.g., temperature, max_tokens)
            usage_details: Token usage information (e.g., prompt_tokens, completion_tokens)
            cost_details: Cost information for the model call
            prompt: Associated prompt template from Langfuse prompt management

        Returns:
            A context manager that yields the appropriate observation type based on as_type

        Example:
            ```python
            # Create a span
            with langfuse.start_as_current_observation(name="process-query", as_type="span") as span:
                # Do work
                result = process_data()
                span.update(output=result)

                # Create a child span automatically
                with span.start_as_current_span(name="sub-operation") as child_span:
                    # Do sub-operation work
                    child_span.update(output="sub-result")

            # Create a tool observation
            with langfuse.start_as_current_observation(name="web-search", as_type="tool") as tool:
                # Do tool work
                results = search_web(query)
                tool.update(output=results)

            # Create a generation observation
            with langfuse.start_as_current_observation(
                name="answer-generation",
                as_type="generation",
                model="gpt-4"
            ) as generation:
                # Generate answer
                response = llm.generate(...)
                generation.update(output=response)
            ```
        """
        if as_type in get_observation_types_list(ObservationTypeGenerationLike):
            if trace_context:
                trace_id = trace_context.get("trace_id", None)
                parent_span_id = trace_context.get("parent_span_id", None)

                if trace_id:
                    remote_parent_span = self._create_remote_parent_span(
                        trace_id=trace_id, parent_span_id=parent_span_id
                    )

                    return cast(
                        Union[
                            _AgnosticContextManager[LangfuseGeneration],
                            _AgnosticContextManager[LangfuseEmbedding],
                        ],
                        self._create_span_with_parent_context(
                            as_type=as_type,
                            name=name,
                            remote_parent_span=remote_parent_span,
                            parent=None,
                            end_on_exit=end_on_exit,
                            input=input,
                            output=output,
                            metadata=metadata,
                            version=version,
                            level=level,
                            status_message=status_message,
                            completion_start_time=completion_start_time,
                            model=model,
                            model_parameters=model_parameters,
                            usage_details=usage_details,
                            cost_details=cost_details,
                            prompt=prompt,
                        ),
                    )

            return cast(
                Union[
                    _AgnosticContextManager[LangfuseGeneration],
                    _AgnosticContextManager[LangfuseEmbedding],
                ],
                self._start_as_current_otel_span_with_processed_media(
                    as_type=as_type,
                    name=name,
                    end_on_exit=end_on_exit,
                    input=input,
                    output=output,
                    metadata=metadata,
                    version=version,
                    level=level,
                    status_message=status_message,
                    completion_start_time=completion_start_time,
                    model=model,
                    model_parameters=model_parameters,
                    usage_details=usage_details,
                    cost_details=cost_details,
                    prompt=prompt,
                ),
            )

        if as_type in get_observation_types_list(ObservationTypeSpanLike):
            if trace_context:
                trace_id = trace_context.get("trace_id", None)
                parent_span_id = trace_context.get("parent_span_id", None)

                if trace_id:
                    remote_parent_span = self._create_remote_parent_span(
                        trace_id=trace_id, parent_span_id=parent_span_id
                    )

                    return cast(
                        Union[
                            _AgnosticContextManager[LangfuseSpan],
                            _AgnosticContextManager[LangfuseAgent],
                            _AgnosticContextManager[LangfuseTool],
                            _AgnosticContextManager[LangfuseChain],
                            _AgnosticContextManager[LangfuseRetriever],
                            _AgnosticContextManager[LangfuseEvaluator],
                            _AgnosticContextManager[LangfuseGuardrail],
                        ],
                        self._create_span_with_parent_context(
                            as_type=as_type,
                            name=name,
                            remote_parent_span=remote_parent_span,
                            parent=None,
                            end_on_exit=end_on_exit,
                            input=input,
                            output=output,
                            metadata=metadata,
                            version=version,
                            level=level,
                            status_message=status_message,
                        ),
                    )

            return cast(
                Union[
                    _AgnosticContextManager[LangfuseSpan],
                    _AgnosticContextManager[LangfuseAgent],
                    _AgnosticContextManager[LangfuseTool],
                    _AgnosticContextManager[LangfuseChain],
                    _AgnosticContextManager[LangfuseRetriever],
                    _AgnosticContextManager[LangfuseEvaluator],
                    _AgnosticContextManager[LangfuseGuardrail],
                ],
                self._start_as_current_otel_span_with_processed_media(
                    as_type=as_type,
                    name=name,
                    end_on_exit=end_on_exit,
                    input=input,
                    output=output,
                    metadata=metadata,
                    version=version,
                    level=level,
                    status_message=status_message,
                ),
            )

        # This should never be reached since all valid types are handled above
        langfuse_logger.warning(
            f"Unknown observation type: {as_type}, falling back to span"
        )
        return self._start_as_current_otel_span_with_processed_media(
            as_type="span",
            name=name,
            end_on_exit=end_on_exit,
            input=input,
            output=output,
            metadata=metadata,
            version=version,
            level=level,
            status_message=status_message,
        )
````
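The `trace_context` branch above attaches new observations to an existing trace by synthesizing a remote parent span. A hedged sketch, using the static `create_trace_id` helper defined later in the file to produce an ID in the 32-char lowercase hex format the validators expect:

```python
from langfuse import Langfuse

langfuse = Langfuse()

# Deterministic 32-hex trace ID derived from an external identifier
trace_id = Langfuse.create_trace_id(seed="external-request-123")

with langfuse.start_as_current_observation(
    name="continue-elsewhere",
    as_type="span",
    trace_context={"trace_id": trace_id},  # optionally also "parent_span_id"
) as span:
    span.update(output="joined an existing trace")
```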
Span-class resolution, context helpers, and in-context update methods:

````python
    def _get_span_class(
        self,
        as_type: ObservationTypeLiteral,
    ) -> Union[
        Type[LangfuseAgent],
        Type[LangfuseTool],
        Type[LangfuseChain],
        Type[LangfuseRetriever],
        Type[LangfuseEvaluator],
        Type[LangfuseEmbedding],
        Type[LangfuseGuardrail],
        Type[LangfuseGeneration],
        Type[LangfuseEvent],
        Type[LangfuseSpan],
    ]:
        """Get the appropriate span class based on as_type."""
        normalized_type = as_type.lower()

        if normalized_type == "agent":
            return LangfuseAgent
        elif normalized_type == "tool":
            return LangfuseTool
        elif normalized_type == "chain":
            return LangfuseChain
        elif normalized_type == "retriever":
            return LangfuseRetriever
        elif normalized_type == "evaluator":
            return LangfuseEvaluator
        elif normalized_type == "embedding":
            return LangfuseEmbedding
        elif normalized_type == "guardrail":
            return LangfuseGuardrail
        elif normalized_type == "generation":
            return LangfuseGeneration
        elif normalized_type == "event":
            return LangfuseEvent
        elif normalized_type == "span":
            return LangfuseSpan
        else:
            return LangfuseSpan

    @_agnosticcontextmanager
    def _create_span_with_parent_context(
        self,
        *,
        name: str,
        parent: Optional[otel_trace_api.Span] = None,
        remote_parent_span: Optional[otel_trace_api.Span] = None,
        as_type: ObservationTypeLiteralNoEvent,
        end_on_exit: Optional[bool] = None,
        input: Optional[Any] = None,
        output: Optional[Any] = None,
        metadata: Optional[Any] = None,
        version: Optional[str] = None,
        level: Optional[SpanLevel] = None,
        status_message: Optional[str] = None,
        completion_start_time: Optional[datetime] = None,
        model: Optional[str] = None,
        model_parameters: Optional[Dict[str, MapValue]] = None,
        usage_details: Optional[Dict[str, int]] = None,
        cost_details: Optional[Dict[str, float]] = None,
        prompt: Optional[PromptClient] = None,
    ) -> Any:
        parent_span = parent or cast(otel_trace_api.Span, remote_parent_span)

        with otel_trace_api.use_span(parent_span):
            with self._start_as_current_otel_span_with_processed_media(
                name=name,
                as_type=as_type,
                end_on_exit=end_on_exit,
                input=input,
                output=output,
                metadata=metadata,
                version=version,
                level=level,
                status_message=status_message,
                completion_start_time=completion_start_time,
                model=model,
                model_parameters=model_parameters,
                usage_details=usage_details,
                cost_details=cost_details,
                prompt=prompt,
            ) as langfuse_span:
                if remote_parent_span is not None:
                    langfuse_span._otel_span.set_attribute(
                        LangfuseOtelSpanAttributes.AS_ROOT, True
                    )

                yield langfuse_span

    @_agnosticcontextmanager
    def _start_as_current_otel_span_with_processed_media(
        self,
        *,
        name: str,
        as_type: Optional[ObservationTypeLiteralNoEvent] = None,
        end_on_exit: Optional[bool] = None,
        input: Optional[Any] = None,
        output: Optional[Any] = None,
        metadata: Optional[Any] = None,
        version: Optional[str] = None,
        level: Optional[SpanLevel] = None,
        status_message: Optional[str] = None,
        completion_start_time: Optional[datetime] = None,
        model: Optional[str] = None,
        model_parameters: Optional[Dict[str, MapValue]] = None,
        usage_details: Optional[Dict[str, int]] = None,
        cost_details: Optional[Dict[str, float]] = None,
        prompt: Optional[PromptClient] = None,
    ) -> Any:
        with self._otel_tracer.start_as_current_span(
            name=name,
            end_on_exit=end_on_exit if end_on_exit is not None else True,
        ) as otel_span:
            span_class = self._get_span_class(
                as_type or "generation"
            )  # default was "generation"
            common_args = {
                "otel_span": otel_span,
                "langfuse_client": self,
                "environment": self._environment,
                "input": input,
                "output": output,
                "metadata": metadata,
                "version": version,
                "level": level,
                "status_message": status_message,
            }

            if span_class in [
                LangfuseGeneration,
                LangfuseEmbedding,
            ]:
                common_args.update(
                    {
                        "completion_start_time": completion_start_time,
                        "model": model,
                        "model_parameters": model_parameters,
                        "usage_details": usage_details,
                        "cost_details": cost_details,
                        "prompt": prompt,
                    }
                )
            # For span-like types (span, agent, tool, chain, retriever, evaluator,
            # guardrail), no generation properties are needed

            yield span_class(**common_args)  # type: ignore[arg-type]

    def _get_current_otel_span(self) -> Optional[otel_trace_api.Span]:
        current_span = otel_trace_api.get_current_span()

        if current_span is otel_trace_api.INVALID_SPAN:
            langfuse_logger.warning(
                "Context error: No active span in current context. Operations that depend on an active span will be skipped. "
                "Ensure spans are created with start_as_current_span() or that you're operating within an active span context."
            )
            return None

        return current_span

    def update_current_generation(
        self,
        *,
        name: Optional[str] = None,
        input: Optional[Any] = None,
        output: Optional[Any] = None,
        metadata: Optional[Any] = None,
        version: Optional[str] = None,
        level: Optional[SpanLevel] = None,
        status_message: Optional[str] = None,
        completion_start_time: Optional[datetime] = None,
        model: Optional[str] = None,
        model_parameters: Optional[Dict[str, MapValue]] = None,
        usage_details: Optional[Dict[str, int]] = None,
        cost_details: Optional[Dict[str, float]] = None,
        prompt: Optional[PromptClient] = None,
    ) -> None:
        """Update the current active generation span with new information.

        This method updates the current generation span in the active context with
        additional information. It's useful for adding output, usage stats, or other
        details that become available during or after model generation.

        Args:
            name: The generation name
            input: Updated input data for the model
            output: Output from the model (e.g., completions)
            metadata: Additional metadata to associate with the generation
            version: Version identifier for the model or component
            level: Importance level of the generation (info, warning, error)
            status_message: Optional status message for the generation
            completion_start_time: When the model started generating the response
            model: Name/identifier of the AI model used (e.g., "gpt-4")
            model_parameters: Parameters used for the model (e.g., temperature, max_tokens)
            usage_details: Token usage information (e.g., prompt_tokens, completion_tokens)
            cost_details: Cost information for the model call
            prompt: Associated prompt template from Langfuse prompt management

        Example:
            ```python
            with langfuse.start_as_current_generation(name="answer-query") as generation:
                # Initial setup and API call
                response = llm.generate(...)

                # Update with results that weren't available at creation time
                langfuse.update_current_generation(
                    output=response.text,
                    usage_details={
                        "prompt_tokens": response.usage.prompt_tokens,
                        "completion_tokens": response.usage.completion_tokens
                    }
                )
            ```
        """
        if not self._tracing_enabled:
            langfuse_logger.debug(
                "Operation skipped: update_current_generation - Tracing is disabled or client is in no-op mode."
            )
            return

        current_otel_span = self._get_current_otel_span()

        if current_otel_span is not None:
            generation = LangfuseGeneration(
                otel_span=current_otel_span, langfuse_client=self
            )

            if name:
                current_otel_span.update_name(name)

            generation.update(
                input=input,
                output=output,
                metadata=metadata,
                version=version,
                level=level,
                status_message=status_message,
                completion_start_time=completion_start_time,
                model=model,
                model_parameters=model_parameters,
                usage_details=usage_details,
                cost_details=cost_details,
                prompt=prompt,
            )

    def update_current_span(
        self,
        *,
        name: Optional[str] = None,
        input: Optional[Any] = None,
        output: Optional[Any] = None,
        metadata: Optional[Any] = None,
        version: Optional[str] = None,
        level: Optional[SpanLevel] = None,
        status_message: Optional[str] = None,
    ) -> None:
        """Update the current active span with new information.

        This method updates the current span in the active context with
        additional information. It's useful for adding outputs or metadata
        that become available during execution.

        Args:
            name: The span name
            input: Updated input data for the operation
            output: Output data from the operation
            metadata: Additional metadata to associate with the span
            version: Version identifier for the code or component
            level: Importance level of the span (info, warning, error)
            status_message: Optional status message for the span

        Example:
            ```python
            with langfuse.start_as_current_span(name="process-data") as span:
                # Initial processing
                result = process_first_part()

                # Update with intermediate results
                langfuse.update_current_span(metadata={"intermediate_result": result})

                # Continue processing
                final_result = process_second_part(result)

                # Final update
                langfuse.update_current_span(output=final_result)
            ```
        """
        if not self._tracing_enabled:
            langfuse_logger.debug(
                "Operation skipped: update_current_span - Tracing is disabled or client is in no-op mode."
            )
            return

        current_otel_span = self._get_current_otel_span()

        if current_otel_span is not None:
            span = LangfuseSpan(
                otel_span=current_otel_span,
                langfuse_client=self,
                environment=self._environment,
            )

            if name:
                current_otel_span.update_name(name)

            span.update(
                input=input,
                output=output,
                metadata=metadata,
                version=version,
                level=level,
                status_message=status_message,
            )

    def update_current_trace(
        self,
        *,
        name: Optional[str] = None,
        user_id: Optional[str] = None,
        session_id: Optional[str] = None,
        version: Optional[str] = None,
        input: Optional[Any] = None,
        output: Optional[Any] = None,
        metadata: Optional[Any] = None,
        tags: Optional[List[str]] = None,
        public: Optional[bool] = None,
    ) -> None:
        """Update the current trace with additional information.

        Args:
            name: Updated name for the Langfuse trace
            user_id: ID of the user who initiated the Langfuse trace
            session_id: Session identifier for grouping related Langfuse traces
            version: Version identifier for the application or service
            input: Input data for the overall Langfuse trace
            output: Output data from the overall Langfuse trace
            metadata: Additional metadata to associate with the Langfuse trace
            tags: List of tags to categorize the Langfuse trace
            public: Whether the Langfuse trace should be publicly accessible

        See Also:
            :func:`langfuse.propagate_attributes`: Recommended replacement
        """
        if not self._tracing_enabled:
            langfuse_logger.debug(
                "Operation skipped: update_current_trace - Tracing is disabled or client is in no-op mode."
            )
            return

        current_otel_span = self._get_current_otel_span()

        if current_otel_span is not None:
            existing_observation_type = current_otel_span.attributes.get(  # type: ignore[attr-defined]
                LangfuseOtelSpanAttributes.OBSERVATION_TYPE, "span"
            )
            # We need to preserve the class to keep the correct observation type
            span_class = self._get_span_class(existing_observation_type)
            span = span_class(
                otel_span=current_otel_span,
                langfuse_client=self,
                environment=self._environment,
            )

            span.update_trace(
                name=name,
                user_id=user_id,
                session_id=session_id,
                version=version,
                input=input,
                output=output,
                metadata=metadata,
                tags=tags,
                public=public,
            )
````
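`update_current_trace` has no inline example; here is a sketch consistent with its signature (the docstring points to `propagate_attributes` as the recommended replacement, so treat this as the legacy pattern). The IDs and tags are placeholders.

```python
from langfuse import Langfuse

langfuse = Langfuse()

with langfuse.start_as_current_span(name="chat-turn"):
    # Attach trace-level attributes from anywhere inside the active span context
    langfuse.update_current_trace(
        name="chat-session",
        user_id="user-123",
        session_id="session-456",
        tags=["production", "chat"],
        metadata={"client": "mobile"},
    )
```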
Events, remote parent spans, ID helpers, and score creation:

````python
    def create_event(
        self,
        *,
        trace_context: Optional[TraceContext] = None,
        name: str,
        input: Optional[Any] = None,
        output: Optional[Any] = None,
        metadata: Optional[Any] = None,
        version: Optional[str] = None,
        level: Optional[SpanLevel] = None,
        status_message: Optional[str] = None,
    ) -> LangfuseEvent:
        """Create a new Langfuse observation of type 'EVENT'.

        The created Langfuse Event observation will be the child of the current span in the context.

        Args:
            trace_context: Optional context for connecting to an existing trace
            name: Name of the span (e.g., function or operation name)
            input: Input data for the operation (can be any JSON-serializable object)
            output: Output data from the operation (can be any JSON-serializable object)
            metadata: Additional metadata to associate with the span
            version: Version identifier for the code or component
            level: Importance level of the span (info, warning, error)
            status_message: Optional status message for the span

        Returns:
            The Langfuse Event object

        Example:
            ```python
            event = langfuse.create_event(name="process-event")
            ```
        """
        timestamp = time_ns()

        if trace_context:
            trace_id = trace_context.get("trace_id", None)
            parent_span_id = trace_context.get("parent_span_id", None)

            if trace_id:
                remote_parent_span = self._create_remote_parent_span(
                    trace_id=trace_id, parent_span_id=parent_span_id
                )

                with otel_trace_api.use_span(
                    cast(otel_trace_api.Span, remote_parent_span)
                ):
                    otel_span = self._otel_tracer.start_span(
                        name=name, start_time=timestamp
                    )
                    otel_span.set_attribute(LangfuseOtelSpanAttributes.AS_ROOT, True)

                    return cast(
                        LangfuseEvent,
                        LangfuseEvent(
                            otel_span=otel_span,
                            langfuse_client=self,
                            environment=self._environment,
                            input=input,
                            output=output,
                            metadata=metadata,
                            version=version,
                            level=level,
                            status_message=status_message,
                        ).end(end_time=timestamp),
                    )

        otel_span = self._otel_tracer.start_span(name=name, start_time=timestamp)

        return cast(
            LangfuseEvent,
            LangfuseEvent(
                otel_span=otel_span,
                langfuse_client=self,
                environment=self._environment,
                input=input,
                output=output,
                metadata=metadata,
                version=version,
                level=level,
                status_message=status_message,
            ).end(end_time=timestamp),
        )

    def _create_remote_parent_span(
        self, *, trace_id: str, parent_span_id: Optional[str]
    ) -> Any:
        if not self._is_valid_trace_id(trace_id):
            langfuse_logger.warning(
                f"Passed trace ID '{trace_id}' is not a valid 32 lowercase hex char Langfuse trace id. Ignoring trace ID."
            )

        if parent_span_id and not self._is_valid_span_id(parent_span_id):
            langfuse_logger.warning(
                f"Passed span ID '{parent_span_id}' is not a valid 16 lowercase hex char Langfuse span id. Ignoring parent span ID."
            )

        int_trace_id = int(trace_id, 16)
        int_parent_span_id = (
            int(parent_span_id, 16)
            if parent_span_id
            else RandomIdGenerator().generate_span_id()
        )

        span_context = otel_trace_api.SpanContext(
            trace_id=int_trace_id,
            span_id=int_parent_span_id,
            trace_flags=otel_trace_api.TraceFlags(0x01),  # mark span as sampled
            is_remote=False,
        )

        return otel_trace_api.NonRecordingSpan(span_context)

    def _is_valid_trace_id(self, trace_id: str) -> bool:
        pattern = r"^[0-9a-f]{32}$"

        return bool(re.match(pattern, trace_id))

    def _is_valid_span_id(self, span_id: str) -> bool:
        pattern = r"^[0-9a-f]{16}$"

        return bool(re.match(pattern, span_id))

    def _create_observation_id(self, *, seed: Optional[str] = None) -> str:
        """Create a unique observation ID for use with Langfuse.

        This method generates a unique observation ID (span ID in OpenTelemetry terms)
        for use with various Langfuse APIs. It can either generate a random ID or
        create a deterministic ID based on a seed string.

        Observation IDs must be 16 lowercase hexadecimal characters, representing 8 bytes.
        This method ensures the generated ID meets this requirement. If you need to
        correlate an external ID with a Langfuse observation ID, use the external ID as
        the seed to get a valid, deterministic observation ID.

        Args:
            seed: Optional string to use as a seed for deterministic ID generation.
                If provided, the same seed will always produce the same ID.
                If not provided, a random ID will be generated.

        Returns:
            A 16-character lowercase hexadecimal string representing the observation ID.

        Example:
            ```python
            # Generate a random observation ID
            obs_id = langfuse.create_observation_id()

            # Generate a deterministic ID based on a seed
            user_obs_id = langfuse.create_observation_id(seed="user-123-feedback")

            # Correlate an external item ID with a Langfuse observation ID
            item_id = "item-789012"
            correlated_obs_id = langfuse.create_observation_id(seed=item_id)

            # Use the ID with Langfuse APIs
            langfuse.create_score(
                name="relevance",
                value=0.95,
                trace_id=trace_id,
                observation_id=obs_id
            )
            ```
        """
        if not seed:
            span_id_int = RandomIdGenerator().generate_span_id()

            return self._format_otel_span_id(span_id_int)

        return sha256(seed.encode("utf-8")).digest()[:8].hex()

    @staticmethod
    def create_trace_id(*, seed: Optional[str] = None) -> str:
        """Create a unique trace ID for use with Langfuse.

        This method generates a unique trace ID for use with various Langfuse APIs.
        It can either generate a random ID or create a deterministic ID based on
        a seed string.

        Trace IDs must be 32 lowercase hexadecimal characters, representing 16 bytes.
        This method ensures the generated ID meets this requirement. If you need to
        correlate an external ID with a Langfuse trace ID, use the external ID as the
        seed to get a valid, deterministic Langfuse trace ID.

        Args:
            seed: Optional string to use as a seed for deterministic ID generation.
                If provided, the same seed will always produce the same ID.
                If not provided, a random ID will be generated.

        Returns:
            A 32-character lowercase hexadecimal string representing the Langfuse trace ID.

        Example:
            ```python
            # Generate a random trace ID
            trace_id = langfuse.create_trace_id()

            # Generate a deterministic ID based on a seed
            session_trace_id = langfuse.create_trace_id(seed="session-456")

            # Correlate an external ID with a Langfuse trace ID
            external_id = "external-system-123456"
            correlated_trace_id = langfuse.create_trace_id(seed=external_id)

            # Use the ID with trace context
            with langfuse.start_as_current_span(
                name="process-request",
                trace_context={"trace_id": trace_id}
            ) as span:
                # Operation will be part of the specific trace
                pass
            ```
        """
        if not seed:
            trace_id_int = RandomIdGenerator().generate_trace_id()

            return Langfuse._format_otel_trace_id(trace_id_int)

        return sha256(seed.encode("utf-8")).digest()[:16].hex()

    def _get_otel_trace_id(self, otel_span: otel_trace_api.Span) -> str:
        span_context = otel_span.get_span_context()

        return self._format_otel_trace_id(span_context.trace_id)

    def _get_otel_span_id(self, otel_span: otel_trace_api.Span) -> str:
        span_context = otel_span.get_span_context()

        return self._format_otel_span_id(span_context.span_id)

    @staticmethod
    def _format_otel_span_id(span_id_int: int) -> str:
        """Format an integer span ID to a 16-character lowercase hex string.

        Internal method to convert an OpenTelemetry integer span ID to the standard
        W3C Trace Context format (16-character lowercase hex string).

        Args:
            span_id_int: 64-bit integer representing a span ID

        Returns:
            A 16-character lowercase hexadecimal string
        """
        return format(span_id_int, "016x")

    @staticmethod
    def _format_otel_trace_id(trace_id_int: int) -> str:
        """Format an integer trace ID to a 32-character lowercase hex string.

        Internal method to convert an OpenTelemetry integer trace ID to the standard
        W3C Trace Context format (32-character lowercase hex string).

        Args:
            trace_id_int: 128-bit integer representing a trace ID

        Returns:
            A 32-character lowercase hexadecimal string
        """
        return format(trace_id_int, "032x")

    @overload
    def create_score(
        self,
        *,
        name: str,
        value: float,
        session_id: Optional[str] = None,
        dataset_run_id: Optional[str] = None,
        trace_id: Optional[str] = None,
        observation_id: Optional[str] = None,
        score_id: Optional[str] = None,
        data_type: Optional[Literal["NUMERIC", "BOOLEAN"]] = None,
        comment: Optional[str] = None,
        config_id: Optional[str] = None,
        metadata: Optional[Any] = None,
    ) -> None: ...

    @overload
    def create_score(
        self,
        *,
        name: str,
        value: str,
        session_id: Optional[str] = None,
        dataset_run_id: Optional[str] = None,
        trace_id: Optional[str] = None,
        score_id: Optional[str] = None,
        observation_id: Optional[str] = None,
        data_type: Optional[Literal["CATEGORICAL"]] = "CATEGORICAL",
        comment: Optional[str] = None,
        config_id: Optional[str] = None,
        metadata: Optional[Any] = None,
    ) -> None: ...
````
1986 1987 def create_score( 1988 self, 1989 *, 1990 name: str, 1991 value: Union[float, str], 1992 session_id: Optional[str] = None, 1993 dataset_run_id: Optional[str] = None, 1994 trace_id: Optional[str] = None, 1995 observation_id: Optional[str] = None, 1996 score_id: Optional[str] = None, 1997 data_type: Optional[ScoreDataType] = None, 1998 comment: Optional[str] = None, 1999 config_id: Optional[str] = None, 2000 metadata: Optional[Any] = None, 2001 ) -> None: 2002 """Create a score for a specific trace or observation. 2003 2004 This method creates a score for evaluating a Langfuse trace or observation. Scores can be 2005 used to track quality metrics, user feedback, or automated evaluations. 2006 2007 Args: 2008 name: Name of the score (e.g., "relevance", "accuracy") 2009 value: Score value (can be numeric for NUMERIC/BOOLEAN types or string for CATEGORICAL) 2010 session_id: ID of the Langfuse session to associate the score with 2011 dataset_run_id: ID of the Langfuse dataset run to associate the score with 2012 trace_id: ID of the Langfuse trace to associate the score with 2013 observation_id: Optional ID of the specific observation to score. Trace ID must be provided too. 2014 score_id: Optional custom ID for the score (auto-generated if not provided) 2015 data_type: Type of score (NUMERIC, BOOLEAN, or CATEGORICAL) 2016 comment: Optional comment or explanation for the score 2017 config_id: Optional ID of a score config defined in Langfuse 2018 metadata: Optional metadata to be attached to the score 2019 2020 Example: 2021 ```python 2022 # Create a numeric score for accuracy 2023 langfuse.create_score( 2024 name="accuracy", 2025 value=0.92, 2026 trace_id="abcdef1234567890abcdef1234567890", 2027 data_type="NUMERIC", 2028 comment="High accuracy with minor irrelevant details" 2029 ) 2030 2031 # Create a categorical score for sentiment 2032 langfuse.create_score( 2033 name="sentiment", 2034 value="positive", 2035 trace_id="abcdef1234567890abcdef1234567890", 2036 observation_id="abcdef1234567890", 2037 data_type="CATEGORICAL" 2038 ) 2039 ``` 2040 """ 2041 if not self._tracing_enabled: 2042 return 2043 2044 score_id = score_id or self._create_observation_id() 2045 2046 try: 2047 new_body = ScoreBody( 2048 id=score_id, 2049 sessionId=session_id, 2050 datasetRunId=dataset_run_id, 2051 traceId=trace_id, 2052 observationId=observation_id, 2053 name=name, 2054 value=value, 2055 dataType=data_type, # type: ignore 2056 comment=comment, 2057 configId=config_id, 2058 environment=self._environment, 2059 metadata=metadata, 2060 ) 2061 2062 event = { 2063 "id": self.create_trace_id(), 2064 "type": "score-create", 2065 "timestamp": _get_timestamp(), 2066 "body": new_body, 2067 } 2068 2069 if self._resources is not None: 2070 # Force the score to be in sample if it was for a legacy trace ID, i.e. non-32 hexchar 2071 force_sample = ( 2072 not self._is_valid_trace_id(trace_id) if trace_id else True 2073 ) 2074 2075 self._resources.add_score_task( 2076 event, 2077 force_sample=force_sample, 2078 ) 2079 2080 except Exception as e: 2081 langfuse_logger.exception( 2082 f"Error creating score: Failed to process score event for trace_id={trace_id}, name={name}. Error: {e}" 2083 ) 2084 2085 @overload 2086 def score_current_span( 2087 self, 2088 *, 2089 name: str, 2090 value: float, 2091 score_id: Optional[str] = None, 2092 data_type: Optional[Literal["NUMERIC", "BOOLEAN"]] = None, 2093 comment: Optional[str] = None, 2094 config_id: Optional[str] = None, 2095 ) -> None: ... 
2096 2097 @overload 2098 def score_current_span( 2099 self, 2100 *, 2101 name: str, 2102 value: str, 2103 score_id: Optional[str] = None, 2104 data_type: Optional[Literal["CATEGORICAL"]] = "CATEGORICAL", 2105 comment: Optional[str] = None, 2106 config_id: Optional[str] = None, 2107 ) -> None: ... 2108 2109 def score_current_span( 2110 self, 2111 *, 2112 name: str, 2113 value: Union[float, str], 2114 score_id: Optional[str] = None, 2115 data_type: Optional[ScoreDataType] = None, 2116 comment: Optional[str] = None, 2117 config_id: Optional[str] = None, 2118 ) -> None: 2119 """Create a score for the current active span. 2120 2121 This method scores the currently active span in the context. It's a convenient 2122 way to score the current operation without needing to know its trace and span IDs. 2123 2124 Args: 2125 name: Name of the score (e.g., "relevance", "accuracy") 2126 value: Score value (can be numeric for NUMERIC/BOOLEAN types or string for CATEGORICAL) 2127 score_id: Optional custom ID for the score (auto-generated if not provided) 2128 data_type: Type of score (NUMERIC, BOOLEAN, or CATEGORICAL) 2129 comment: Optional comment or explanation for the score 2130 config_id: Optional ID of a score config defined in Langfuse 2131 2132 Example: 2133 ```python 2134 with langfuse.start_as_current_generation(name="answer-query") as generation: 2135 # Generate answer 2136 response = generate_answer(...) 2137 generation.update(output=response) 2138 2139 # Score the generation 2140 langfuse.score_current_span( 2141 name="relevance", 2142 value=0.85, 2143 data_type="NUMERIC", 2144 comment="Mostly relevant but contains some tangential information" 2145 ) 2146 ``` 2147 """ 2148 current_span = self._get_current_otel_span() 2149 2150 if current_span is not None: 2151 trace_id = self._get_otel_trace_id(current_span) 2152 observation_id = self._get_otel_span_id(current_span) 2153 2154 langfuse_logger.info( 2155 f"Score: Creating score name='{name}' value={value} for current span ({observation_id}) in trace {trace_id}" 2156 ) 2157 2158 self.create_score( 2159 trace_id=trace_id, 2160 observation_id=observation_id, 2161 name=name, 2162 value=cast(str, value), 2163 score_id=score_id, 2164 data_type=cast(Literal["CATEGORICAL"], data_type), 2165 comment=comment, 2166 config_id=config_id, 2167 ) 2168 2169 @overload 2170 def score_current_trace( 2171 self, 2172 *, 2173 name: str, 2174 value: float, 2175 score_id: Optional[str] = None, 2176 data_type: Optional[Literal["NUMERIC", "BOOLEAN"]] = None, 2177 comment: Optional[str] = None, 2178 config_id: Optional[str] = None, 2179 ) -> None: ... 2180 2181 @overload 2182 def score_current_trace( 2183 self, 2184 *, 2185 name: str, 2186 value: str, 2187 score_id: Optional[str] = None, 2188 data_type: Optional[Literal["CATEGORICAL"]] = "CATEGORICAL", 2189 comment: Optional[str] = None, 2190 config_id: Optional[str] = None, 2191 ) -> None: ... 2192 2193 def score_current_trace( 2194 self, 2195 *, 2196 name: str, 2197 value: Union[float, str], 2198 score_id: Optional[str] = None, 2199 data_type: Optional[ScoreDataType] = None, 2200 comment: Optional[str] = None, 2201 config_id: Optional[str] = None, 2202 ) -> None: 2203 """Create a score for the current trace. 2204 2205 This method scores the trace of the currently active span. Unlike score_current_span, 2206 this method associates the score with the entire trace rather than a specific span. 2207 It's useful for scoring overall performance or quality of the entire operation. 
2208 2209 Args: 2210 name: Name of the score (e.g., "user_satisfaction", "overall_quality") 2211 value: Score value (can be numeric for NUMERIC/BOOLEAN types or string for CATEGORICAL) 2212 score_id: Optional custom ID for the score (auto-generated if not provided) 2213 data_type: Type of score (NUMERIC, BOOLEAN, or CATEGORICAL) 2214 comment: Optional comment or explanation for the score 2215 config_id: Optional ID of a score config defined in Langfuse 2216 2217 Example: 2218 ```python 2219 with langfuse.start_as_current_span(name="process-user-request") as span: 2220 # Process request 2221 result = process_complete_request() 2222 span.update(output=result) 2223 2224 # Score the overall trace 2225 langfuse.score_current_trace( 2226 name="overall_quality", 2227 value=0.95, 2228 data_type="NUMERIC", 2229 comment="High quality end-to-end response" 2230 ) 2231 ``` 2232 """ 2233 current_span = self._get_current_otel_span() 2234 2235 if current_span is not None: 2236 trace_id = self._get_otel_trace_id(current_span) 2237 2238 langfuse_logger.info( 2239 f"Score: Creating score name='{name}' value={value} for entire trace {trace_id}" 2240 ) 2241 2242 self.create_score( 2243 trace_id=trace_id, 2244 name=name, 2245 value=cast(str, value), 2246 score_id=score_id, 2247 data_type=cast(Literal["CATEGORICAL"], data_type), 2248 comment=comment, 2249 config_id=config_id, 2250 ) 2251 2252 def flush(self) -> None: 2253 """Force flush all pending spans and events to the Langfuse API. 2254 2255 This method manually flushes any pending spans, scores, and other events to the 2256 Langfuse API. It's useful in scenarios where you want to ensure all data is sent 2257 before proceeding, without waiting for the automatic flush interval. 2258 2259 Example: 2260 ```python 2261 # Record some spans and scores 2262 with langfuse.start_as_current_span(name="operation") as span: 2263 # Do work... 2264 pass 2265 2266 # Ensure all data is sent to Langfuse before proceeding 2267 langfuse.flush() 2268 2269 # Continue with other work 2270 ``` 2271 """ 2272 if self._resources is not None: 2273 self._resources.flush() 2274 2275 def shutdown(self) -> None: 2276 """Shut down the Langfuse client and flush all pending data. 2277 2278 This method cleanly shuts down the Langfuse client, ensuring all pending data 2279 is flushed to the API and all background threads are properly terminated. 2280 2281 It's important to call this method when your application is shutting down to 2282 prevent data loss and resource leaks. For most applications, using the client 2283 as a context manager or relying on the automatic shutdown via atexit is sufficient. 2284 2285 Example: 2286 ```python 2287 # Initialize Langfuse 2288 langfuse = Langfuse(public_key="...", secret_key="...") 2289 2290 # Use Langfuse throughout your application 2291 # ... 2292 2293 # When application is shutting down 2294 langfuse.shutdown() 2295 ``` 2296 """ 2297 if self._resources is not None: 2298 self._resources.shutdown() 2299 2300 def get_current_trace_id(self) -> Optional[str]: 2301 """Get the trace ID of the current active span. 2302 2303 This method retrieves the trace ID from the currently active span in the context. 2304 It can be used to get the trace ID for referencing in logs, external systems, 2305 or for creating related operations. 2306 2307 Returns: 2308 The current trace ID as a 32-character lowercase hexadecimal string, 2309 or None if there is no active span. 
2310 2311 Example: 2312 ```python 2313 with langfuse.start_as_current_span(name="process-request") as span: 2314 # Get the current trace ID for reference 2315 trace_id = langfuse.get_current_trace_id() 2316 2317 # Use it for external correlation 2318 log.info(f"Processing request with trace_id: {trace_id}") 2319 2320 # Or pass to another system 2321 external_system.process(data, trace_id=trace_id) 2322 ``` 2323 """ 2324 if not self._tracing_enabled: 2325 langfuse_logger.debug( 2326 "Operation skipped: get_current_trace_id - Tracing is disabled or client is in no-op mode." 2327 ) 2328 return None 2329 2330 current_otel_span = self._get_current_otel_span() 2331 2332 return self._get_otel_trace_id(current_otel_span) if current_otel_span else None 2333 2334 def get_current_observation_id(self) -> Optional[str]: 2335 """Get the observation ID (span ID) of the current active span. 2336 2337 This method retrieves the observation ID from the currently active span in the context. 2338 It can be used to get the observation ID for referencing in logs, external systems, 2339 or for creating scores or other related operations. 2340 2341 Returns: 2342 The current observation ID as a 16-character lowercase hexadecimal string, 2343 or None if there is no active span. 2344 2345 Example: 2346 ```python 2347 with langfuse.start_as_current_span(name="process-user-query") as span: 2348 # Get the current observation ID 2349 observation_id = langfuse.get_current_observation_id() 2350 2351 # Store it for later reference 2352 cache.set(f"query_{query_id}_observation", observation_id) 2353 2354 # Process the query... 2355 ``` 2356 """ 2357 if not self._tracing_enabled: 2358 langfuse_logger.debug( 2359 "Operation skipped: get_current_observation_id - Tracing is disabled or client is in no-op mode." 2360 ) 2361 return None 2362 2363 current_otel_span = self._get_current_otel_span() 2364 2365 return self._get_otel_span_id(current_otel_span) if current_otel_span else None 2366 2367 def _get_project_id(self) -> Optional[str]: 2368 """Fetch and return the current project id. Persisted across requests. Returns None if no project id is found for api keys.""" 2369 if not self._project_id: 2370 proj = self.api.projects.get() 2371 if not proj.data or not proj.data[0].id: 2372 return None 2373 2374 self._project_id = proj.data[0].id 2375 2376 return self._project_id 2377 2378 def get_trace_url(self, *, trace_id: Optional[str] = None) -> Optional[str]: 2379 """Get the URL to view a trace in the Langfuse UI. 2380 2381 This method generates a URL that links directly to a trace in the Langfuse UI. 2382 It's useful for providing links in logs, notifications, or debugging tools. 2383 2384 Args: 2385 trace_id: Optional trace ID to generate a URL for. If not provided, 2386 the trace ID of the current active span will be used. 2387 2388 Returns: 2389 A URL string pointing to the trace in the Langfuse UI, 2390 or None if the project ID couldn't be retrieved or no trace ID is available. 
2391 2392 Example: 2393 ```python 2394 # Get URL for the current trace 2395 with langfuse.start_as_current_span(name="process-request") as span: 2396 trace_url = langfuse.get_trace_url() 2397 log.info(f"Processing trace: {trace_url}") 2398 2399 # Get URL for a specific trace 2400 specific_trace_url = langfuse.get_trace_url(trace_id="1234567890abcdef1234567890abcdef") 2401 send_notification(f"Review needed for trace: {specific_trace_url}") 2402 ``` 2403 """ 2404 project_id = self._get_project_id() 2405 final_trace_id = trace_id or self.get_current_trace_id() 2406 2407 return ( 2408 f"{self._base_url}/project/{project_id}/traces/{final_trace_id}" 2409 if project_id and final_trace_id 2410 else None 2411 ) 2412 2413 def get_dataset( 2414 self, name: str, *, fetch_items_page_size: Optional[int] = 50 2415 ) -> "DatasetClient": 2416 """Fetch a dataset by its name. 2417 2418 Args: 2419 name (str): The name of the dataset to fetch. 2420 fetch_items_page_size (Optional[int]): All items of the dataset will be fetched in chunks of this size. Defaults to 50. 2421 2422 Returns: 2423 DatasetClient: The dataset with the given name. 2424 """ 2425 try: 2426 langfuse_logger.debug(f"Getting datasets {name}") 2427 dataset = self.api.datasets.get(dataset_name=name) 2428 2429 dataset_items = [] 2430 page = 1 2431 2432 while True: 2433 new_items = self.api.dataset_items.list( 2434 dataset_name=self._url_encode(name, is_url_param=True), 2435 page=page, 2436 limit=fetch_items_page_size, 2437 ) 2438 dataset_items.extend(new_items.data) 2439 2440 if new_items.meta.total_pages <= page: 2441 break 2442 2443 page += 1 2444 2445 items = [DatasetItemClient(i, langfuse=self) for i in dataset_items] 2446 2447 return DatasetClient(dataset, items=items) 2448 2449 except Error as e: 2450 handle_fern_exception(e) 2451 raise e 2452 2453 def run_experiment( 2454 self, 2455 *, 2456 name: str, 2457 run_name: Optional[str] = None, 2458 description: Optional[str] = None, 2459 data: ExperimentData, 2460 task: TaskFunction, 2461 evaluators: List[EvaluatorFunction] = [], 2462 run_evaluators: List[RunEvaluatorFunction] = [], 2463 max_concurrency: int = 50, 2464 metadata: Optional[Dict[str, str]] = None, 2465 ) -> ExperimentResult: 2466 """Run an experiment on a dataset with automatic tracing and evaluation. 2467 2468 This method executes a task function on each item in the provided dataset, 2469 automatically traces all executions with Langfuse for observability, runs 2470 item-level and run-level evaluators on the outputs, and returns comprehensive 2471 results with evaluation metrics. 2472 2473 The experiment system provides: 2474 - Automatic tracing of all task executions 2475 - Concurrent processing with configurable limits 2476 - Comprehensive error handling that isolates failures 2477 - Integration with Langfuse datasets for experiment tracking 2478 - Flexible evaluation framework supporting both sync and async evaluators 2479 2480 Args: 2481 name: Human-readable name for the experiment. Used for identification 2482 in the Langfuse UI. 2483 run_name: Optional exact name for the experiment run. If provided, this will be 2484 used as the exact dataset run name if the `data` contains Langfuse dataset items. 2485 If not provided, this will default to the experiment name appended with an ISO timestamp. 2486 description: Optional description explaining the experiment's purpose, 2487 methodology, or expected outcomes. 2488 data: Array of data items to process. 
Can be either: 2489 - List of dict-like items with 'input', 'expected_output', 'metadata' keys 2490 - List of Langfuse DatasetItem objects from dataset.items 2491 task: Function that processes each data item and returns output. 2492 Must accept 'item' as keyword argument and can return sync or async results. 2493 The task function signature should be: task(*, item, **kwargs) -> Any 2494 evaluators: List of functions to evaluate each item's output individually. 2495 Each evaluator receives input, output, expected_output, and metadata. 2496 Can return single Evaluation dict or list of Evaluation dicts. 2497 run_evaluators: List of functions to evaluate the entire experiment run. 2498 Each run evaluator receives all item_results and can compute aggregate metrics. 2499 Useful for calculating averages, distributions, or cross-item comparisons. 2500 max_concurrency: Maximum number of concurrent task executions (default: 50). 2501 Controls the number of items processed simultaneously. Adjust based on 2502 API rate limits and system resources. 2503 metadata: Optional metadata dictionary to attach to all experiment traces. 2504 This metadata will be included in every trace created during the experiment. 2505 If `data` are Langfuse dataset items, the metadata will be attached to the dataset run, too. 2506 2507 Returns: 2508 ExperimentResult containing: 2509 - run_name: The experiment run name. This is equal to the dataset run name if experiment was on Langfuse dataset. 2510 - item_results: List of results for each processed item with outputs and evaluations 2511 - run_evaluations: List of aggregate evaluation results for the entire run 2512 - dataset_run_id: ID of the dataset run (if using Langfuse datasets) 2513 - dataset_run_url: Direct URL to view results in Langfuse UI (if applicable) 2514 2515 Raises: 2516 ValueError: If required parameters are missing or invalid 2517 Exception: If experiment setup fails (individual item failures are handled gracefully) 2518 2519 Examples: 2520 Basic experiment with local data: 2521 ```python 2522 def summarize_text(*, item, **kwargs): 2523 return f"Summary: {item['input'][:50]}..." 
2524 2525 def length_evaluator(*, input, output, expected_output=None, **kwargs): 2526 return { 2527 "name": "output_length", 2528 "value": len(output), 2529 "comment": f"Output contains {len(output)} characters" 2530 } 2531 2532 result = langfuse.run_experiment( 2533 name="Text Summarization Test", 2534 description="Evaluate summarization quality and length", 2535 data=[ 2536 {"input": "Long article text...", "expected_output": "Expected summary"}, 2537 {"input": "Another article...", "expected_output": "Another summary"} 2538 ], 2539 task=summarize_text, 2540 evaluators=[length_evaluator] 2541 ) 2542 2543 print(f"Processed {len(result.item_results)} items") 2544 for item_result in result.item_results: 2545 print(f"Input: {item_result.item['input']}") 2546 print(f"Output: {item_result.output}") 2547 print(f"Evaluations: {item_result.evaluations}") 2548 ``` 2549 2550 Advanced experiment with async task and multiple evaluators: 2551 ```python 2552 async def llm_task(*, item, **kwargs): 2553 # Simulate async LLM call 2554 response = await openai_client.chat.completions.create( 2555 model="gpt-4", 2556 messages=[{"role": "user", "content": item["input"]}] 2557 ) 2558 return response.choices[0].message.content 2559 2560 def accuracy_evaluator(*, input, output, expected_output=None, **kwargs): 2561 if expected_output and expected_output.lower() in output.lower(): 2562 return {"name": "accuracy", "value": 1.0, "comment": "Correct answer"} 2563 return {"name": "accuracy", "value": 0.0, "comment": "Incorrect answer"} 2564 2565 def toxicity_evaluator(*, input, output, expected_output=None, **kwargs): 2566 # Simulate toxicity check 2567 toxicity_score = check_toxicity(output) # Your toxicity checker 2568 return { 2569 "name": "toxicity", 2570 "value": toxicity_score, 2571 "comment": f"Toxicity level: {'high' if toxicity_score > 0.7 else 'low'}" 2572 } 2573 2574 def average_accuracy(*, item_results, **kwargs): 2575 accuracies = [ 2576 evaluation.value for result in item_results 2577 for evaluation in result.evaluations 2578 if evaluation.name == "accuracy" 2579 ] 2580 return { 2581 "name": "average_accuracy", 2582 "value": sum(accuracies) / len(accuracies) if accuracies else 0, 2583 "comment": f"Average accuracy across {len(accuracies)} items" 2584 } 2585 2586 result = langfuse.run_experiment( 2587 name="LLM Safety and Accuracy Test", 2588 description="Evaluate model accuracy and safety across diverse prompts", 2589 data=test_dataset, # Your dataset items 2590 task=llm_task, 2591 evaluators=[accuracy_evaluator, toxicity_evaluator], 2592 run_evaluators=[average_accuracy], 2593 max_concurrency=5, # Limit concurrent API calls 2594 metadata={"model": "gpt-4", "temperature": 0.7} 2595 ) 2596 ``` 2597 2598 Using with Langfuse datasets: 2599 ```python 2600 # Get dataset from Langfuse 2601 dataset = langfuse.get_dataset("my-eval-dataset") 2602 2603 result = dataset.run_experiment( 2604 name="Production Model Evaluation", 2605 description="Monthly evaluation of production model performance", 2606 task=my_production_task, 2607 evaluators=[accuracy_evaluator, latency_evaluator] 2608 ) 2609 2610 # Results automatically linked to dataset in Langfuse UI 2611 print(f"View results: {result.dataset_run_url}") 2612 ``` 2613 2614 Note: 2615 - Task and evaluator functions can be either synchronous or asynchronous 2616 - Individual item failures are logged but don't stop the experiment 2617 - All executions are automatically traced and visible in Langfuse UI 2618 - When using Langfuse datasets, results are automatically linked for easy
comparison 2619 - This method works in both sync and async contexts (Jupyter notebooks, web apps, etc.) 2620 - Async execution is handled automatically with smart event loop detection 2621 """ 2622 return cast( 2623 ExperimentResult, 2624 run_async_safely( 2625 self._run_experiment_async( 2626 name=name, 2627 run_name=self._create_experiment_run_name( 2628 name=name, run_name=run_name 2629 ), 2630 description=description, 2631 data=data, 2632 task=task, 2633 evaluators=evaluators or [], 2634 run_evaluators=run_evaluators or [], 2635 max_concurrency=max_concurrency, 2636 metadata=metadata, 2637 ), 2638 ), 2639 ) 2640 2641 async def _run_experiment_async( 2642 self, 2643 *, 2644 name: str, 2645 run_name: str, 2646 description: Optional[str], 2647 data: ExperimentData, 2648 task: TaskFunction, 2649 evaluators: List[EvaluatorFunction], 2650 run_evaluators: List[RunEvaluatorFunction], 2651 max_concurrency: int, 2652 metadata: Optional[Dict[str, Any]] = None, 2653 ) -> ExperimentResult: 2654 langfuse_logger.debug( 2655 f"Starting experiment '{name}' run '{run_name}' with {len(data)} items" 2656 ) 2657 2658 # Set up concurrency control 2659 semaphore = asyncio.Semaphore(max_concurrency) 2660 2661 # Process all items 2662 async def process_item(item: ExperimentItem) -> ExperimentItemResult: 2663 async with semaphore: 2664 return await self._process_experiment_item( 2665 item, task, evaluators, name, run_name, description, metadata 2666 ) 2667 2668 # Run all items concurrently 2669 tasks = [process_item(item) for item in data] 2670 item_results = await asyncio.gather(*tasks, return_exceptions=True) 2671 2672 # Filter out any exceptions and log errors 2673 valid_results: List[ExperimentItemResult] = [] 2674 for i, result in enumerate(item_results): 2675 if isinstance(result, Exception): 2676 langfuse_logger.error(f"Item {i} failed: {result}") 2677 elif isinstance(result, ExperimentItemResult): 2678 valid_results.append(result) # type: ignore 2679 2680 # Run experiment-level evaluators 2681 run_evaluations: List[Evaluation] = [] 2682 for run_evaluator in run_evaluators: 2683 try: 2684 evaluations = await _run_evaluator( 2685 run_evaluator, item_results=valid_results 2686 ) 2687 run_evaluations.extend(evaluations) 2688 except Exception as e: 2689 langfuse_logger.error(f"Run evaluator failed: {e}") 2690 2691 # Generate dataset run URL if applicable 2692 dataset_run_id = valid_results[0].dataset_run_id if valid_results else None 2693 dataset_run_url = None 2694 if dataset_run_id and data: 2695 try: 2696 # Check if the first item has dataset_id (for DatasetItem objects) 2697 first_item = data[0] 2698 dataset_id = None 2699 2700 if hasattr(first_item, "dataset_id"): 2701 dataset_id = getattr(first_item, "dataset_id", None) 2702 2703 if dataset_id: 2704 project_id = self._get_project_id() 2705 2706 if project_id: 2707 dataset_run_url = f"{self._base_url}/project/{project_id}/datasets/{dataset_id}/runs/{dataset_run_id}" 2708 2709 except Exception: 2710 pass # URL generation is optional 2711 2712 # Store run-level evaluations as scores 2713 for evaluation in run_evaluations: 2714 try: 2715 if dataset_run_id: 2716 self.create_score( 2717 dataset_run_id=dataset_run_id, 2718 name=evaluation.name or "<unknown>", 2719 value=evaluation.value, # type: ignore 2720 comment=evaluation.comment, 2721 metadata=evaluation.metadata, 2722 data_type=evaluation.data_type, # type: ignore 2723 config_id=evaluation.config_id, 2724 ) 2725 2726 except Exception as e: 2727 langfuse_logger.error(f"Failed to store run evaluation: 
{e}") 2728 2729 # Flush scores and traces 2730 self.flush() 2731 2732 return ExperimentResult( 2733 name=name, 2734 run_name=run_name, 2735 description=description, 2736 item_results=valid_results, 2737 run_evaluations=run_evaluations, 2738 dataset_run_id=dataset_run_id, 2739 dataset_run_url=dataset_run_url, 2740 ) 2741 2742 async def _process_experiment_item( 2743 self, 2744 item: ExperimentItem, 2745 task: Callable, 2746 evaluators: List[Callable], 2747 experiment_name: str, 2748 experiment_run_name: str, 2749 experiment_description: Optional[str], 2750 experiment_metadata: Optional[Dict[str, Any]] = None, 2751 ) -> ExperimentItemResult: 2752 span_name = "experiment-item-run" 2753 2754 with self.start_as_current_span(name=span_name) as span: 2755 try: 2756 input_data = ( 2757 item.get("input") 2758 if isinstance(item, dict) 2759 else getattr(item, "input", None) 2760 ) 2761 2762 if input_data is None: 2763 raise ValueError("Experiment Item is missing input. Skipping item.") 2764 2765 expected_output = ( 2766 item.get("expected_output") 2767 if isinstance(item, dict) 2768 else getattr(item, "expected_output", None) 2769 ) 2770 2771 item_metadata = ( 2772 item.get("metadata") 2773 if isinstance(item, dict) 2774 else getattr(item, "metadata", None) 2775 ) 2776 2777 final_observation_metadata = { 2778 "experiment_name": experiment_name, 2779 "experiment_run_name": experiment_run_name, 2780 **(experiment_metadata or {}), 2781 } 2782 2783 trace_id = span.trace_id 2784 dataset_id = None 2785 dataset_item_id = None 2786 dataset_run_id = None 2787 2788 # Link to dataset run if this is a dataset item 2789 if hasattr(item, "id") and hasattr(item, "dataset_id"): 2790 try: 2791 # Use sync API to avoid event loop issues when run_async_safely 2792 # creates multiple event loops across different threads 2793 dataset_run_item = await asyncio.to_thread( 2794 self.api.dataset_run_items.create, 2795 request=CreateDatasetRunItemRequest( 2796 runName=experiment_run_name, 2797 runDescription=experiment_description, 2798 metadata=experiment_metadata, 2799 datasetItemId=item.id, # type: ignore 2800 traceId=trace_id, 2801 observationId=span.id, 2802 ), 2803 ) 2804 2805 dataset_run_id = dataset_run_item.dataset_run_id 2806 2807 except Exception as e: 2808 langfuse_logger.error(f"Failed to create dataset run item: {e}") 2809 2810 if ( 2811 not isinstance(item, dict) 2812 and hasattr(item, "dataset_id") 2813 and hasattr(item, "id") 2814 ): 2815 dataset_id = item.dataset_id 2816 dataset_item_id = item.id 2817 2818 final_observation_metadata.update( 2819 {"dataset_id": dataset_id, "dataset_item_id": dataset_item_id} 2820 ) 2821 2822 if isinstance(item_metadata, dict): 2823 final_observation_metadata.update(item_metadata) 2824 2825 experiment_id = dataset_run_id or self._create_observation_id() 2826 experiment_item_id = ( 2827 dataset_item_id or get_sha256_hash_hex(_serialize(input_data))[:16] 2828 ) 2829 span._otel_span.set_attributes( 2830 { 2831 k: v 2832 for k, v in { 2833 LangfuseOtelSpanAttributes.ENVIRONMENT: LANGFUSE_SDK_EXPERIMENT_ENVIRONMENT, 2834 LangfuseOtelSpanAttributes.EXPERIMENT_DESCRIPTION: experiment_description, 2835 LangfuseOtelSpanAttributes.EXPERIMENT_ITEM_EXPECTED_OUTPUT: _serialize( 2836 expected_output 2837 ), 2838 }.items() 2839 if v is not None 2840 } 2841 ) 2842 2843 with _propagate_attributes( 2844 experiment=PropagatedExperimentAttributes( 2845 experiment_id=experiment_id, 2846 experiment_name=experiment_run_name, 2847 experiment_metadata=_serialize(experiment_metadata), 2848 
experiment_dataset_id=dataset_id, 2849 experiment_item_id=experiment_item_id, 2850 experiment_item_metadata=_serialize(item_metadata), 2851 experiment_item_root_observation_id=span.id, 2852 ) 2853 ): 2854 output = await _run_task(task, item) 2855 2856 span.update( 2857 input=input_data, 2858 output=output, 2859 metadata=final_observation_metadata, 2860 ) 2861 2862 # Run evaluators 2863 evaluations = [] 2864 2865 for evaluator in evaluators: 2866 try: 2867 eval_metadata: Optional[Dict[str, Any]] = None 2868 2869 if isinstance(item, dict): 2870 eval_metadata = item.get("metadata") 2871 elif hasattr(item, "metadata"): 2872 eval_metadata = item.metadata 2873 2874 eval_results = await _run_evaluator( 2875 evaluator, 2876 input=input_data, 2877 output=output, 2878 expected_output=expected_output, 2879 metadata=eval_metadata, 2880 ) 2881 evaluations.extend(eval_results) 2882 2883 # Store evaluations as scores 2884 for evaluation in eval_results: 2885 self.create_score( 2886 trace_id=trace_id, 2887 observation_id=span.id, 2888 name=evaluation.name, 2889 value=evaluation.value, # type: ignore 2890 comment=evaluation.comment, 2891 metadata=evaluation.metadata, 2892 config_id=evaluation.config_id, 2893 data_type=evaluation.data_type, # type: ignore 2894 ) 2895 2896 except Exception as e: 2897 langfuse_logger.error(f"Evaluator failed: {e}") 2898 2899 return ExperimentItemResult( 2900 item=item, 2901 output=output, 2902 evaluations=evaluations, 2903 trace_id=trace_id, 2904 dataset_run_id=dataset_run_id, 2905 ) 2906 2907 except Exception as e: 2908 span.update( 2909 output=f"Error: {str(e)}", level="ERROR", status_message=str(e) 2910 ) 2911 raise e 2912 2913 def _create_experiment_run_name( 2914 self, *, name: Optional[str] = None, run_name: Optional[str] = None 2915 ) -> str: 2916 if run_name: 2917 return run_name 2918 2919 iso_timestamp = _get_timestamp().isoformat().replace("+00:00", "Z") 2920 2921 return f"{name} - {iso_timestamp}" 2922 2923 def auth_check(self) -> bool: 2924 """Check if the provided credentials (public and secret key) are valid. 2925 2926 Raises: 2927 Exception: If no projects were found for the provided credentials. 2928 2929 Note: 2930 This method is blocking. It is discouraged to use it in production code. 2931 """ 2932 try: 2933 projects = self.api.projects.get() 2934 langfuse_logger.debug( 2935 f"Auth check successful, found {len(projects.data)} projects" 2936 ) 2937 if len(projects.data) == 0: 2938 raise Exception( 2939 "Auth check failed, no project found for the keys provided." 2940 ) 2941 return True 2942 2943 except AttributeError as e: 2944 langfuse_logger.warning( 2945 f"Auth check failed: Client not properly initialized. Error: {e}" 2946 ) 2947 return False 2948 2949 except Error as e: 2950 handle_fern_exception(e) 2951 raise e 2952 2953 def create_dataset( 2954 self, 2955 *, 2956 name: str, 2957 description: Optional[str] = None, 2958 metadata: Optional[Any] = None, 2959 ) -> Dataset: 2960 """Create a dataset with the given name on Langfuse. 2961 2962 Args: 2963 name: Name of the dataset to create. 2964 description: Description of the dataset. Defaults to None. 2965 metadata: Additional metadata. Defaults to None. 2966 2967 Returns: 2968 Dataset: The created dataset as returned by the Langfuse API. 
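Example (dataset name, description, and metadata are illustrative; mirrors the create_dataset_item example below):
```python
langfuse.create_dataset(
    name="capital_cities",
    description="Country -> capital city Q&A pairs",
    metadata={"type": "benchmark"}
)
```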
2969 """ 2970 try: 2971 body = CreateDatasetRequest( 2972 name=name, description=description, metadata=metadata 2973 ) 2974 langfuse_logger.debug(f"Creating datasets {body}") 2975 2976 return self.api.datasets.create(request=body) 2977 2978 except Error as e: 2979 handle_fern_exception(e) 2980 raise e 2981 2982 def create_dataset_item( 2983 self, 2984 *, 2985 dataset_name: str, 2986 input: Optional[Any] = None, 2987 expected_output: Optional[Any] = None, 2988 metadata: Optional[Any] = None, 2989 source_trace_id: Optional[str] = None, 2990 source_observation_id: Optional[str] = None, 2991 status: Optional[DatasetStatus] = None, 2992 id: Optional[str] = None, 2993 ) -> DatasetItem: 2994 """Create a dataset item. 2995 2996 Upserts if an item with id already exists. 2997 2998 Args: 2999 dataset_name: Name of the dataset in which the dataset item should be created. 3000 input: Input data. Defaults to None. Can contain any dict, list or scalar. 3001 expected_output: Expected output data. Defaults to None. Can contain any dict, list or scalar. 3002 metadata: Additional metadata. Defaults to None. Can contain any dict, list or scalar. 3003 source_trace_id: Id of the source trace. Defaults to None. 3004 source_observation_id: Id of the source observation. Defaults to None. 3005 status: Status of the dataset item. Defaults to ACTIVE for newly created items. 3006 id: Id of the dataset item. Defaults to None. Provide your own id if you want to dedupe dataset items. Id needs to be globally unique and cannot be reused across datasets. 3007 3008 Returns: 3009 DatasetItem: The created dataset item as returned by the Langfuse API. 3010 3011 Example: 3012 ```python 3013 from langfuse import Langfuse 3014 3015 langfuse = Langfuse() 3016 3017 # Uploading items to the Langfuse dataset named "capital_cities" 3018 langfuse.create_dataset_item( 3019 dataset_name="capital_cities", 3020 input={"input": {"country": "Italy"}}, 3021 expected_output={"expected_output": "Rome"}, 3022 metadata={"foo": "bar"} 3023 ) 3024 ``` 3025 """ 3026 try: 3027 body = CreateDatasetItemRequest( 3028 datasetName=dataset_name, 3029 input=input, 3030 expectedOutput=expected_output, 3031 metadata=metadata, 3032 sourceTraceId=source_trace_id, 3033 sourceObservationId=source_observation_id, 3034 status=status, 3035 id=id, 3036 ) 3037 langfuse_logger.debug(f"Creating dataset item {body}") 3038 return self.api.dataset_items.create(request=body) 3039 except Error as e: 3040 handle_fern_exception(e) 3041 raise e 3042 3043 def resolve_media_references( 3044 self, 3045 *, 3046 obj: Any, 3047 resolve_with: Literal["base64_data_uri"], 3048 max_depth: int = 10, 3049 content_fetch_timeout_seconds: int = 5, 3050 ) -> Any: 3051 """Replace media reference strings in an object with base64 data URIs. 3052 3053 This method recursively traverses an object (up to max_depth) looking for media reference strings 3054 in the format "@@@langfuseMedia:...@@@". When found, it (synchronously) fetches the actual media content using 3055 the provided Langfuse client and replaces the reference string with a base64 data URI. 3056 3057 If fetching media content fails for a reference string, a warning is logged and the reference 3058 string is left unchanged. 3059 3060 Args: 3061 obj: The object to process. Can be a primitive value, array, or nested object. 3062 If the object has a __dict__ attribute, a dict will be returned instead of the original object type. 3063 resolve_with: The representation of the media content to replace the media reference string with. 
Currently only "base64_data_uri" is supported. 3065 max_depth: int: The maximum depth to traverse the object. Default is 10. 3066 content_fetch_timeout_seconds: int: The timeout in seconds for fetching media content. Default is 5. 3067 3068 Returns: 3069 A deep copy of the input object with all media references replaced with base64 data URIs where possible. 3070 If the input object has a __dict__ attribute, a dict will be returned instead of the original object type. 3071 3072 Example: 3073 obj = { 3074 "image": "@@@langfuseMedia:type=image/jpeg|id=123|source=bytes@@@", 3075 "nested": { 3076 "pdf": "@@@langfuseMedia:type=application/pdf|id=456|source=bytes@@@" 3077 } 3078 } 3079 3080 result = langfuse.resolve_media_references(obj=obj, resolve_with="base64_data_uri") 3081 3082 # Result: 3083 # { 3084 # "image": "data:image/jpeg;base64,/9j/4AAQSkZJRg...", 3085 # "nested": { 3086 # "pdf": "data:application/pdf;base64,JVBERi0xLjcK..." 3087 # } 3088 # } 3089 """ 3090 return LangfuseMedia.resolve_media_references( 3091 langfuse_client=self, 3092 obj=obj, 3093 resolve_with=resolve_with, 3094 max_depth=max_depth, 3095 content_fetch_timeout_seconds=content_fetch_timeout_seconds, 3096 ) 3097 3098 @overload 3099 def get_prompt( 3100 self, 3101 name: str, 3102 *, 3103 version: Optional[int] = None, 3104 label: Optional[str] = None, 3105 type: Literal["chat"], 3106 cache_ttl_seconds: Optional[int] = None, 3107 fallback: Optional[List[ChatMessageDict]] = None, 3108 max_retries: Optional[int] = None, 3109 fetch_timeout_seconds: Optional[int] = None, 3110 ) -> ChatPromptClient: ... 3111 3112 @overload 3113 def get_prompt( 3114 self, 3115 name: str, 3116 *, 3117 version: Optional[int] = None, 3118 label: Optional[str] = None, 3119 type: Literal["text"] = "text", 3120 cache_ttl_seconds: Optional[int] = None, 3121 fallback: Optional[str] = None, 3122 max_retries: Optional[int] = None, 3123 fetch_timeout_seconds: Optional[int] = None, 3124 ) -> TextPromptClient: ... 3125 3126 def get_prompt( 3127 self, 3128 name: str, 3129 *, 3130 version: Optional[int] = None, 3131 label: Optional[str] = None, 3132 type: Literal["chat", "text"] = "text", 3133 cache_ttl_seconds: Optional[int] = None, 3134 fallback: Union[Optional[List[ChatMessageDict]], Optional[str]] = None, 3135 max_retries: Optional[int] = None, 3136 fetch_timeout_seconds: Optional[int] = None, 3137 ) -> PromptClient: 3138 """Get a prompt. 3139 3140 This method attempts to fetch the requested prompt from the local cache. If the prompt is not found 3141 in the cache or if the cached prompt has expired, it will try to fetch the prompt from the server again 3142 and update the cache. If fetching the new prompt fails, and there is an expired prompt in the cache, it will 3143 return the expired prompt as a fallback. 3144 3145 Args: 3146 name (str): The name of the prompt to retrieve. 3147 3148 Keyword Args: 3149 version (Optional[int]): The version of the prompt to retrieve. If neither version nor label is specified, the `production` label is returned. Specify either version or label, not both. 3150 label: Optional[str]: The label of the prompt to retrieve. If neither version nor label is specified, the `production` label is returned. Specify either version or label, not both. 3151 cache_ttl_seconds: Optional[int]: Time-to-live in seconds for caching the prompt. Must be specified as a 3152 keyword argument. If not set, defaults to 60 seconds. Disables caching if set to 0. 3153 type: Literal["chat", "text"]: The type of the prompt to retrieve. Defaults to "text".
3154 fallback: Union[Optional[List[ChatMessageDict]], Optional[str]]: The prompt string to return if fetching the prompt fails. Important on the first call where no cached prompt is available. Follows Langfuse prompt formatting with double curly braces for variables. Defaults to None. 3155 max_retries: Optional[int]: The maximum number of retries in case of API/network errors. Defaults to 2. The maximum value is 4. Retries have an exponential backoff with a maximum delay of 10 seconds. 3156 fetch_timeout_seconds: Optional[int]: The timeout in seconds for fetching the prompt. Defaults to the SDK-wide timeout, which is 5 seconds by default. 3157 3158 Returns: 3159 The prompt object retrieved from the cache or directly fetched if not cached or expired of type 3160 - TextPromptClient, if type argument is 'text'. 3161 - ChatPromptClient, if type argument is 'chat'. 3162 3163 Raises: 3164 Exception: Propagates any exceptions raised during the fetching of a new prompt, unless there is an 3165 expired prompt in the cache, in which case it logs a warning and returns the expired prompt. 3166 """ 3167 if self._resources is None: 3168 raise Error( 3169 "SDK is not correctly initialized. Check the init logs for more details." 3170 ) 3171 if version is not None and label is not None: 3172 raise ValueError("Cannot specify both version and label at the same time.") 3173 3174 if not name: 3175 raise ValueError("Prompt name cannot be empty.") 3176 3177 cache_key = PromptCache.generate_cache_key(name, version=version, label=label) 3178 bounded_max_retries = self._get_bounded_max_retries( 3179 max_retries, default_max_retries=2, max_retries_upper_bound=4 3180 ) 3181 3182 langfuse_logger.debug(f"Getting prompt '{cache_key}'") 3183 cached_prompt = self._resources.prompt_cache.get(cache_key) 3184 3185 if cached_prompt is None or cache_ttl_seconds == 0: 3186 langfuse_logger.debug( 3187 f"Prompt '{cache_key}' not found in cache or caching disabled."
3188 ) 3189 try: 3190 return self._fetch_prompt_and_update_cache( 3191 name, 3192 version=version, 3193 label=label, 3194 ttl_seconds=cache_ttl_seconds, 3195 max_retries=bounded_max_retries, 3196 fetch_timeout_seconds=fetch_timeout_seconds, 3197 ) 3198 except Exception as e: 3199 if fallback: 3200 langfuse_logger.warning( 3201 f"Returning fallback prompt for '{cache_key}' due to fetch error: {e}" 3202 ) 3203 3204 fallback_client_args: Dict[str, Any] = { 3205 "name": name, 3206 "prompt": fallback, 3207 "type": type, 3208 "version": version or 0, 3209 "config": {}, 3210 "labels": [label] if label else [], 3211 "tags": [], 3212 } 3213 3214 if type == "text": 3215 return TextPromptClient( 3216 prompt=Prompt_Text(**fallback_client_args), 3217 is_fallback=True, 3218 ) 3219 3220 if type == "chat": 3221 return ChatPromptClient( 3222 prompt=Prompt_Chat(**fallback_client_args), 3223 is_fallback=True, 3224 ) 3225 3226 raise e 3227 3228 if cached_prompt.is_expired(): 3229 langfuse_logger.debug(f"Stale prompt '{cache_key}' found in cache.") 3230 try: 3231 # refresh prompt in background thread, refresh_prompt deduplicates tasks 3232 langfuse_logger.debug(f"Refreshing prompt '{cache_key}' in background.") 3233 3234 def refresh_task() -> None: 3235 self._fetch_prompt_and_update_cache( 3236 name, 3237 version=version, 3238 label=label, 3239 ttl_seconds=cache_ttl_seconds, 3240 max_retries=bounded_max_retries, 3241 fetch_timeout_seconds=fetch_timeout_seconds, 3242 ) 3243 3244 self._resources.prompt_cache.add_refresh_prompt_task( 3245 cache_key, 3246 refresh_task, 3247 ) 3248 langfuse_logger.debug( 3249 f"Returning stale prompt '{cache_key}' from cache." 3250 ) 3251 # return stale prompt 3252 return cached_prompt.value 3253 3254 except Exception as e: 3255 langfuse_logger.warning( 3256 f"Error when refreshing cached prompt '{cache_key}', returning cached version. 
Error: {e}" 3257 ) 3258 # creation of refresh prompt task failed, return stale prompt 3259 return cached_prompt.value 3260 3261 return cached_prompt.value 3262 3263 def _fetch_prompt_and_update_cache( 3264 self, 3265 name: str, 3266 *, 3267 version: Optional[int] = None, 3268 label: Optional[str] = None, 3269 ttl_seconds: Optional[int] = None, 3270 max_retries: int, 3271 fetch_timeout_seconds: Optional[int], 3272 ) -> PromptClient: 3273 cache_key = PromptCache.generate_cache_key(name, version=version, label=label) 3274 langfuse_logger.debug(f"Fetching prompt '{cache_key}' from server...") 3275 3276 try: 3277 3278 @backoff.on_exception( 3279 backoff.constant, Exception, max_tries=max_retries + 1, logger=None 3280 ) 3281 def fetch_prompts() -> Any: 3282 return self.api.prompts.get( 3283 self._url_encode(name), 3284 version=version, 3285 label=label, 3286 request_options={ 3287 "timeout_in_seconds": fetch_timeout_seconds, 3288 } 3289 if fetch_timeout_seconds is not None 3290 else None, 3291 ) 3292 3293 prompt_response = fetch_prompts() 3294 3295 prompt: PromptClient 3296 if prompt_response.type == "chat": 3297 prompt = ChatPromptClient(prompt_response) 3298 else: 3299 prompt = TextPromptClient(prompt_response) 3300 3301 if self._resources is not None: 3302 self._resources.prompt_cache.set(cache_key, prompt, ttl_seconds) 3303 3304 return prompt 3305 3306 except Exception as e: 3307 langfuse_logger.error( 3308 f"Error while fetching prompt '{cache_key}': {str(e)}" 3309 ) 3310 raise e 3311 3312 def _get_bounded_max_retries( 3313 self, 3314 max_retries: Optional[int], 3315 *, 3316 default_max_retries: int = 2, 3317 max_retries_upper_bound: int = 4, 3318 ) -> int: 3319 if max_retries is None: 3320 return default_max_retries 3321 3322 bounded_max_retries = min( 3323 max(max_retries, 0), 3324 max_retries_upper_bound, 3325 ) 3326 3327 return bounded_max_retries 3328 3329 @overload 3330 def create_prompt( 3331 self, 3332 *, 3333 name: str, 3334 prompt: List[Union[ChatMessageDict, ChatMessageWithPlaceholdersDict]], 3335 labels: List[str] = [], 3336 tags: Optional[List[str]] = None, 3337 type: Optional[Literal["chat"]], 3338 config: Optional[Any] = None, 3339 commit_message: Optional[str] = None, 3340 ) -> ChatPromptClient: ... 3341 3342 @overload 3343 def create_prompt( 3344 self, 3345 *, 3346 name: str, 3347 prompt: str, 3348 labels: List[str] = [], 3349 tags: Optional[List[str]] = None, 3350 type: Optional[Literal["text"]] = "text", 3351 config: Optional[Any] = None, 3352 commit_message: Optional[str] = None, 3353 ) -> TextPromptClient: ... 3354 3355 def create_prompt( 3356 self, 3357 *, 3358 name: str, 3359 prompt: Union[ 3360 str, List[Union[ChatMessageDict, ChatMessageWithPlaceholdersDict]] 3361 ], 3362 labels: List[str] = [], 3363 tags: Optional[List[str]] = None, 3364 type: Optional[Literal["chat", "text"]] = "text", 3365 config: Optional[Any] = None, 3366 commit_message: Optional[str] = None, 3367 ) -> PromptClient: 3368 """Create a new prompt in Langfuse. 3369 3370 Keyword Args: 3371 name : The name of the prompt to be created. 3372 prompt : The content of the prompt to be created. 3373 is_active [DEPRECATED] : A flag indicating whether the prompt is active or not. This is deprecated and will be removed in a future release. Please use the 'production' label instead. 3374 labels: The labels of the prompt. Defaults to None. To create a default-served prompt, add the 'production' label. 3375 tags: The tags of the prompt. Defaults to None. Will be applied to all versions of the prompt. 
3376 config: Additional structured data to be saved with the prompt. Defaults to None. 3377 type: The type of the prompt to be created. "chat" vs. "text". Defaults to "text". 3378 commit_message: Optional string describing the change. 3379 3380 Returns: 3381 TextPromptClient: The prompt if type argument is 'text'. 3382 ChatPromptClient: The prompt if type argument is 'chat'. 3383 """ 3384 try: 3385 langfuse_logger.debug(f"Creating prompt {name=}, {labels=}") 3386 3387 if type == "chat": 3388 if not isinstance(prompt, list): 3389 raise ValueError( 3390 "For 'chat' type, 'prompt' must be a list of chat messages with role and content attributes." 3391 ) 3392 request: Union[CreatePromptRequest_Chat, CreatePromptRequest_Text] = ( 3393 CreatePromptRequest_Chat( 3394 name=name, 3395 prompt=cast(Any, prompt), 3396 labels=labels, 3397 tags=tags, 3398 config=config or {}, 3399 commitMessage=commit_message, 3400 type="chat", 3401 ) 3402 ) 3403 server_prompt = self.api.prompts.create(request=request) 3404 3405 if self._resources is not None: 3406 self._resources.prompt_cache.invalidate(name) 3407 3408 return ChatPromptClient(prompt=cast(Prompt_Chat, server_prompt)) 3409 3410 if not isinstance(prompt, str): 3411 raise ValueError("For 'text' type, 'prompt' must be a string.") 3412 3413 request = CreatePromptRequest_Text( 3414 name=name, 3415 prompt=prompt, 3416 labels=labels, 3417 tags=tags, 3418 config=config or {}, 3419 commitMessage=commit_message, 3420 type="text", 3421 ) 3422 3423 server_prompt = self.api.prompts.create(request=request) 3424 3425 if self._resources is not None: 3426 self._resources.prompt_cache.invalidate(name) 3427 3428 return TextPromptClient(prompt=cast(Prompt_Text, server_prompt)) 3429 3430 except Error as e: 3431 handle_fern_exception(e) 3432 raise e 3433 3434 def update_prompt( 3435 self, 3436 *, 3437 name: str, 3438 version: int, 3439 new_labels: List[str] = [], 3440 ) -> Any: 3441 """Update an existing prompt version in Langfuse. The Langfuse SDK prompt cache is invalidated for all prompts with the specified name. 3442 3443 Args: 3444 name (str): The name of the prompt to update. 3445 version (int): The version number of the prompt to update. 3446 new_labels (List[str], optional): New labels to assign to the prompt version. Labels are unique across versions. The "latest" label is reserved and managed by Langfuse. Defaults to []. 3447 3448 Returns: 3449 Prompt: The updated prompt from the Langfuse API. 3450 3451 """ 3452 updated_prompt = self.api.prompt_version.update( 3453 name=self._url_encode(name), 3454 version=version, 3455 new_labels=new_labels, 3456 ) 3457 3458 if self._resources is not None: 3459 self._resources.prompt_cache.invalidate(name) 3460 3461 return updated_prompt 3462 3463 def _url_encode(self, url: str, *, is_url_param: Optional[bool] = False) -> str: 3464 # httpx ≥ 0.28 does its own WHATWG-compliant quoting (e.g. encodes bare 3465 # “%”, “?”, “#”, “|”, … in query/path parts). Re-quoting here would 3466 # double-encode, so we skip when the value is about to be sent straight 3467 # to httpx (`is_url_param=True`) and the installed version is ≥ 0.28.
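# Illustrative effect of the forced quoting below (prompt name is a made-up
# example): urllib.parse.quote("movie-prompts/titling", safe="") returns
# "movie-prompts%2Ftitling".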
3468 if is_url_param and Version(httpx.__version__) >= Version("0.28.0"): 3469 return url 3470 3471 # urllib.parse.quote does not escape slashes "/" by default, so we pass 3472 # safe="" to force escaping of slashes. 3473 # This is necessary for prompts in prompt folders 3474 return urllib.parse.quote(url, safe="") 3475 3476 def clear_prompt_cache(self) -> None: 3477 """Clear the entire prompt cache, removing all cached prompts. 3478 3479 This method is useful when you want to force a complete refresh of all 3480 cached prompts, for example after major updates or when you need to 3481 ensure the latest versions are fetched from the server. 3482 """ 3483 if self._resources is not None: 3484 self._resources.prompt_cache.clear()
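As a usage sketch for the prompt methods above (the prompt name, contents, and version number are illustrative, and `langfuse` is an initialized client):

```python
# Create a chat prompt served by default via the 'production' label
prompt = langfuse.create_prompt(
    name="movie-critic",  # illustrative prompt name
    type="chat",
    prompt=[{"role": "system", "content": "You are a movie critic."}],
    labels=["production"],
    commit_message="initial version",
)

# Later, assign the 'production' label to a specific version; per the code
# above, this also invalidates the SDK prompt cache for this prompt name
langfuse.update_prompt(name="movie-critic", version=2, new_labels=["production"])
```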
Main client for Langfuse tracing and platform features.
This class provides an interface for creating and managing traces, spans, and generations in Langfuse as well as interacting with the Langfuse API.
The client features a thread-safe singleton pattern for each unique public API key, ensuring consistent trace context propagation across your application. It implements efficient batching of spans with configurable flush settings and includes background thread management for media uploads and score ingestion.
Configuration is flexible through either direct parameters or environment variables, with graceful fallbacks and runtime configuration updates.
Attributes:
- api: Synchronous API client for Langfuse backend communication
- async_api: Asynchronous API client for Langfuse backend communication
- _otel_tracer: Internal LangfuseTracer instance managing OpenTelemetry components
Arguments:
- public_key (Optional[str]): Your Langfuse public API key. Can also be set via LANGFUSE_PUBLIC_KEY environment variable.
- secret_key (Optional[str]): Your Langfuse secret API key. Can also be set via LANGFUSE_SECRET_KEY environment variable.
- base_url (Optional[str]): The Langfuse API base URL. Defaults to "https://cloud.langfuse.com". Can also be set via LANGFUSE_BASE_URL environment variable.
- host (Optional[str]): Deprecated. Use base_url instead. The Langfuse API host URL. Defaults to "https://cloud.langfuse.com".
- timeout (Optional[int]): Timeout in seconds for API requests. Defaults to 5 seconds.
- httpx_client (Optional[httpx.Client]): Custom httpx client for making non-tracing HTTP requests. If not provided, a default client will be created.
- debug (bool): Enable debug logging. Defaults to False. Can also be set via LANGFUSE_DEBUG environment variable.
- tracing_enabled (Optional[bool]): Enable or disable tracing. Defaults to True. Can also be set via LANGFUSE_TRACING_ENABLED environment variable.
- flush_at (Optional[int]): Number of spans to batch before sending to the API. Defaults to 512. Can also be set via LANGFUSE_FLUSH_AT environment variable.
- flush_interval (Optional[float]): Time in seconds between batch flushes. Defaults to 5 seconds. Can also be set via LANGFUSE_FLUSH_INTERVAL environment variable.
- environment (Optional[str]): Environment name for tracing. Default is 'default'. Can also be set via LANGFUSE_TRACING_ENVIRONMENT environment variable. Can be any lowercase alphanumeric string with hyphens and underscores that does not start with 'langfuse'.
- release (Optional[str]): Release version/hash of your application. Used for grouping analytics by release.
- media_upload_thread_count (Optional[int]): Number of background threads for handling media uploads. Defaults to 1. Can also be set via LANGFUSE_MEDIA_UPLOAD_THREAD_COUNT environment variable.
- sample_rate (Optional[float]): Sampling rate for traces (0.0 to 1.0). Defaults to 1.0 (100% of traces are sampled). Can also be set via LANGFUSE_SAMPLE_RATE environment variable.
- mask (Optional[MaskFunction]): Function to mask sensitive data in traces before sending to the API. A minimal sketch of a masking function follows the example below.
- blocked_instrumentation_scopes (Optional[List[str]]): List of instrumentation scope names to block from being exported to Langfuse. Spans from these scopes will be filtered out before being sent to the API. Useful for filtering out spans from specific libraries or frameworks. For exported spans, you can see the instrumentation scope name in the span metadata in Langfuse (metadata.scope.name).
- additional_headers (Optional[Dict[str, str]]): Additional headers to include in all API requests and OTLPSpanExporter requests. These headers will be merged with default headers. Note: If httpx_client is provided, additional_headers must be set directly on your custom httpx_client as well.
- tracer_provider (Optional[TracerProvider]): OpenTelemetry TracerProvider to use for Langfuse. Setting this can be useful to keep Langfuse tracing disconnected from other OpenTelemetry-span-emitting libraries. Note: to track active spans, the context is still shared between TracerProviders, which may lead to broken trace trees.
Example:
```python
from langfuse import Langfuse

# Initialize the client (reads from env vars if not provided)
langfuse = Langfuse(
    public_key="your-public-key",
    secret_key="your-secret-key",
    base_url="https://cloud.langfuse.com",  # Optional, default shown
)

# Create a trace span
with langfuse.start_as_current_span(name="process-query") as span:
    # Your application code here

    # Create a nested generation span for an LLM call
    with span.start_as_current_generation(
        name="generate-response",
        model="gpt-4",
        input={"query": "Tell me about AI"},
        model_parameters={"temperature": 0.7, "max_tokens": 500}
    ) as generation:
        # Generate response here
        response = "AI is a field of computer science..."

        generation.update(
            output=response,
            usage_details={"prompt_tokens": 10, "completion_tokens": 50},
            cost_details={"total_cost": 0.0023}
        )

        # Score the generation (supports NUMERIC, BOOLEAN, CATEGORICAL)
        generation.score(name="relevance", value=0.95, data_type="NUMERIC")
```
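As noted for the mask parameter above, masking is done by a user-supplied callable applied to trace inputs and outputs before export. A minimal sketch, assuming the callable receives each value via a data keyword argument and returns the masked replacement:

```python
from langfuse import Langfuse

def mask_sensitive(data, **kwargs):
    # Assumed contract: receive a raw value via `data`, return the masked value
    if isinstance(data, str) and data.startswith("SECRET_"):
        return "***MASKED***"
    return data

langfuse = Langfuse(mask=mask_sensitive)
```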
200 def __init__( 201 self, 202 *, 203 public_key: Optional[str] = None, 204 secret_key: Optional[str] = None, 205 base_url: Optional[str] = None, 206 host: Optional[str] = None, 207 timeout: Optional[int] = None, 208 httpx_client: Optional[httpx.Client] = None, 209 debug: bool = False, 210 tracing_enabled: Optional[bool] = True, 211 flush_at: Optional[int] = None, 212 flush_interval: Optional[float] = None, 213 environment: Optional[str] = None, 214 release: Optional[str] = None, 215 media_upload_thread_count: Optional[int] = None, 216 sample_rate: Optional[float] = None, 217 mask: Optional[MaskFunction] = None, 218 blocked_instrumentation_scopes: Optional[List[str]] = None, 219 additional_headers: Optional[Dict[str, str]] = None, 220 tracer_provider: Optional[TracerProvider] = None, 221 ): 222 self._base_url = ( 223 base_url 224 or os.environ.get(LANGFUSE_BASE_URL) 225 or host 226 or os.environ.get(LANGFUSE_HOST, "https://cloud.langfuse.com") 227 ) 228 self._environment = environment or cast( 229 str, os.environ.get(LANGFUSE_TRACING_ENVIRONMENT) 230 ) 231 self._project_id: Optional[str] = None 232 sample_rate = sample_rate or float(os.environ.get(LANGFUSE_SAMPLE_RATE, 1.0)) 233 if not 0.0 <= sample_rate <= 1.0: 234 raise ValueError( 235 f"Sample rate must be between 0.0 and 1.0, got {sample_rate}" 236 ) 237 238 timeout = timeout or int(os.environ.get(LANGFUSE_TIMEOUT, 5)) 239 240 self._tracing_enabled = ( 241 tracing_enabled 242 and os.environ.get(LANGFUSE_TRACING_ENABLED, "true").lower() != "false" 243 ) 244 if not self._tracing_enabled: 245 langfuse_logger.info( 246 "Configuration: Langfuse tracing is explicitly disabled. No data will be sent to the Langfuse API." 247 ) 248 249 debug = ( 250 debug if debug else (os.getenv(LANGFUSE_DEBUG, "false").lower() == "true") 251 ) 252 if debug: 253 logging.basicConfig( 254 format="%(asctime)s - %(name)s - %(levelname)s - %(message)s" 255 ) 256 langfuse_logger.setLevel(logging.DEBUG) 257 258 public_key = public_key or os.environ.get(LANGFUSE_PUBLIC_KEY) 259 if public_key is None: 260 langfuse_logger.warning( 261 "Authentication error: Langfuse client initialized without public_key. Client will be disabled. " 262 "Provide a public_key parameter or set LANGFUSE_PUBLIC_KEY environment variable. " 263 ) 264 self._otel_tracer = otel_trace_api.NoOpTracer() 265 return 266 267 secret_key = secret_key or os.environ.get(LANGFUSE_SECRET_KEY) 268 if secret_key is None: 269 langfuse_logger.warning( 270 "Authentication error: Langfuse client initialized without secret_key. Client will be disabled. " 271 "Provide a secret_key parameter or set LANGFUSE_SECRET_KEY environment variable. " 272 ) 273 self._otel_tracer = otel_trace_api.NoOpTracer() 274 return 275 276 if os.environ.get("OTEL_SDK_DISABLED", "false").lower() == "true": 277 langfuse_logger.warning( 278 "OTEL_SDK_DISABLED is set. Langfuse tracing will be disabled and no traces will appear in the UI." 
279 ) 280 281 # Initialize api and tracer if requirements are met 282 self._resources = LangfuseResourceManager( 283 public_key=public_key, 284 secret_key=secret_key, 285 base_url=self._base_url, 286 timeout=timeout, 287 environment=self._environment, 288 release=release, 289 flush_at=flush_at, 290 flush_interval=flush_interval, 291 httpx_client=httpx_client, 292 media_upload_thread_count=media_upload_thread_count, 293 sample_rate=sample_rate, 294 mask=mask, 295 tracing_enabled=self._tracing_enabled, 296 blocked_instrumentation_scopes=blocked_instrumentation_scopes, 297 additional_headers=additional_headers, 298 tracer_provider=tracer_provider, 299 ) 300 self._mask = self._resources.mask 301 302 self._otel_tracer = ( 303 self._resources.tracer 304 if self._tracing_enabled and self._resources.tracer is not None 305 else otel_trace_api.NoOpTracer() 306 ) 307 self.api = self._resources.api 308 self.async_api = self._resources.async_api
310 def start_span( 311 self, 312 *, 313 trace_context: Optional[TraceContext] = None, 314 name: str, 315 input: Optional[Any] = None, 316 output: Optional[Any] = None, 317 metadata: Optional[Any] = None, 318 version: Optional[str] = None, 319 level: Optional[SpanLevel] = None, 320 status_message: Optional[str] = None, 321 ) -> LangfuseSpan: 322 """Create a new span for tracing a unit of work. 323 324 This method creates a new span but does not set it as the current span in the 325 context. To create and use a span within a context, use start_as_current_span(). 326 327 The created span will be the child of the current span in the context. 328 329 Args: 330 trace_context: Optional context for connecting to an existing trace 331 name: Name of the span (e.g., function or operation name) 332 input: Input data for the operation (can be any JSON-serializable object) 333 output: Output data from the operation (can be any JSON-serializable object) 334 metadata: Additional metadata to associate with the span 335 version: Version identifier for the code or component 336 level: Importance level of the span (info, warning, error) 337 status_message: Optional status message for the span 338 339 Returns: 340 A LangfuseSpan object that must be ended with .end() when the operation completes 341 342 Example: 343 ```python 344 span = langfuse.start_span(name="process-data") 345 try: 346 # Do work 347 span.update(output="result") 348 finally: 349 span.end() 350 ``` 351 """ 352 return self.start_observation( 353 trace_context=trace_context, 354 name=name, 355 as_type="span", 356 input=input, 357 output=output, 358 metadata=metadata, 359 version=version, 360 level=level, 361 status_message=status_message, 362 )
Create a new span for tracing a unit of work.
This method creates a new span but does not set it as the current span in the context. To create and use a span within a context, use start_as_current_span().
The created span will be the child of the current span in the context.
Arguments:
- trace_context: Optional context for connecting to an existing trace
- name: Name of the span (e.g., function or operation name)
- input: Input data for the operation (can be any JSON-serializable object)
- output: Output data from the operation (can be any JSON-serializable object)
- metadata: Additional metadata to associate with the span
- version: Version identifier for the code or component
- level: Importance level of the span (info, warning, error)
- status_message: Optional status message for the span
Returns:
A LangfuseSpan object that must be ended with .end() when the operation completes
Example:
```python
span = langfuse.start_span(name="process-data")
try:
    # Do work
    span.update(output="result")
finally:
    span.end()
```
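To attach the span to an existing trace rather than the ambient context, pass trace_context; a sketch using the dict keys the client reads (the trace ID shown is a placeholder):

```python
span = langfuse.start_span(
    name="process-data",
    trace_context={"trace_id": "abcdef1234567890abcdef1234567890"},  # placeholder ID
)
try:
    # Work recorded under the existing trace
    span.update(output="result")
finally:
    span.end()
```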
364 def start_as_current_span( 365 self, 366 *, 367 trace_context: Optional[TraceContext] = None, 368 name: str, 369 input: Optional[Any] = None, 370 output: Optional[Any] = None, 371 metadata: Optional[Any] = None, 372 version: Optional[str] = None, 373 level: Optional[SpanLevel] = None, 374 status_message: Optional[str] = None, 375 end_on_exit: Optional[bool] = None, 376 ) -> _AgnosticContextManager[LangfuseSpan]: 377 """Create a new span and set it as the current span in a context manager. 378 379 This method creates a new span and sets it as the current span within a context 380 manager. Use this method with a 'with' statement to automatically handle span 381 lifecycle within a code block. 382 383 The created span will be the child of the current span in the context. 384 385 Args: 386 trace_context: Optional context for connecting to an existing trace 387 name: Name of the span (e.g., function or operation name) 388 input: Input data for the operation (can be any JSON-serializable object) 389 output: Output data from the operation (can be any JSON-serializable object) 390 metadata: Additional metadata to associate with the span 391 version: Version identifier for the code or component 392 level: Importance level of the span (info, warning, error) 393 status_message: Optional status message for the span 394 end_on_exit (default: True): Whether to end the span automatically when leaving the context manager. If False, the span must be manually ended to avoid memory leaks. 395 396 Returns: 397 A context manager that yields a LangfuseSpan 398 399 Example: 400 ```python 401 with langfuse.start_as_current_span(name="process-query") as span: 402 # Do work 403 result = process_data() 404 span.update(output=result) 405 406 # Create a child span automatically 407 with span.start_as_current_span(name="sub-operation") as child_span: 408 # Do sub-operation work 409 child_span.update(output="sub-result") 410 ``` 411 """ 412 return self.start_as_current_observation( 413 trace_context=trace_context, 414 name=name, 415 as_type="span", 416 input=input, 417 output=output, 418 metadata=metadata, 419 version=version, 420 level=level, 421 status_message=status_message, 422 end_on_exit=end_on_exit, 423 )
Create a new span and set it as the current span in a context manager.
This method creates a new span and sets it as the current span within a context manager. Use this method with a 'with' statement to automatically handle span lifecycle within a code block.
The created span will be the child of the current span in the context.
Arguments:
- trace_context: Optional context for connecting to an existing trace
- name: Name of the span (e.g., function or operation name)
- input: Input data for the operation (can be any JSON-serializable object)
- output: Output data from the operation (can be any JSON-serializable object)
- metadata: Additional metadata to associate with the span
- version: Version identifier for the code or component
- level: Importance level of the span (info, warning, error)
- status_message: Optional status message for the span
- end_on_exit (default: True): Whether to end the span automatically when leaving the context manager. If False, the span must be manually ended to avoid memory leaks.
Returns:
A context manager that yields a LangfuseSpan
Example:
```python
with langfuse.start_as_current_span(name="process-query") as span:
    # Do work
    result = process_data()
    span.update(output=result)

    # Create a child span automatically
    with span.start_as_current_span(name="sub-operation") as child_span:
        # Do sub-operation work
        child_span.update(output="sub-result")
```
572 def start_observation( 573 self, 574 *, 575 trace_context: Optional[TraceContext] = None, 576 name: str, 577 as_type: ObservationTypeLiteralNoEvent = "span", 578 input: Optional[Any] = None, 579 output: Optional[Any] = None, 580 metadata: Optional[Any] = None, 581 version: Optional[str] = None, 582 level: Optional[SpanLevel] = None, 583 status_message: Optional[str] = None, 584 completion_start_time: Optional[datetime] = None, 585 model: Optional[str] = None, 586 model_parameters: Optional[Dict[str, MapValue]] = None, 587 usage_details: Optional[Dict[str, int]] = None, 588 cost_details: Optional[Dict[str, float]] = None, 589 prompt: Optional[PromptClient] = None, 590 ) -> Union[ 591 LangfuseSpan, 592 LangfuseGeneration, 593 LangfuseAgent, 594 LangfuseTool, 595 LangfuseChain, 596 LangfuseRetriever, 597 LangfuseEvaluator, 598 LangfuseEmbedding, 599 LangfuseGuardrail, 600 ]: 601 """Create a new observation of the specified type. 602 603 This method creates a new observation but does not set it as the current span in the 604 context. To create and use an observation within a context, use start_as_current_observation(). 605 606 Args: 607 trace_context: Optional context for connecting to an existing trace 608 name: Name of the observation 609 as_type: Type of observation to create (defaults to "span") 610 input: Input data for the operation 611 output: Output data from the operation 612 metadata: Additional metadata to associate with the observation 613 version: Version identifier for the code or component 614 level: Importance level of the observation 615 status_message: Optional status message for the observation 616 completion_start_time: When the model started generating (for generation types) 617 model: Name/identifier of the AI model used (for generation types) 618 model_parameters: Parameters used for the model (for generation types) 619 usage_details: Token usage information (for generation types) 620 cost_details: Cost information (for generation types) 621 prompt: Associated prompt template (for generation types) 622 623 Returns: 624 An observation object of the appropriate type that must be ended with .end() 625 """ 626 if trace_context: 627 trace_id = trace_context.get("trace_id", None) 628 parent_span_id = trace_context.get("parent_span_id", None) 629 630 if trace_id: 631 remote_parent_span = self._create_remote_parent_span( 632 trace_id=trace_id, parent_span_id=parent_span_id 633 ) 634 635 with otel_trace_api.use_span( 636 cast(otel_trace_api.Span, remote_parent_span) 637 ): 638 otel_span = self._otel_tracer.start_span(name=name) 639 otel_span.set_attribute(LangfuseOtelSpanAttributes.AS_ROOT, True) 640 641 return self._create_observation_from_otel_span( 642 otel_span=otel_span, 643 as_type=as_type, 644 input=input, 645 output=output, 646 metadata=metadata, 647 version=version, 648 level=level, 649 status_message=status_message, 650 completion_start_time=completion_start_time, 651 model=model, 652 model_parameters=model_parameters, 653 usage_details=usage_details, 654 cost_details=cost_details, 655 prompt=prompt, 656 ) 657 658 otel_span = self._otel_tracer.start_span(name=name) 659 660 return self._create_observation_from_otel_span( 661 otel_span=otel_span, 662 as_type=as_type, 663 input=input, 664 output=output, 665 metadata=metadata, 666 version=version, 667 level=level, 668 status_message=status_message, 669 completion_start_time=completion_start_time, 670 model=model, 671 model_parameters=model_parameters, 672 usage_details=usage_details, 673 cost_details=cost_details, 674 
prompt=prompt, 675 )
Create a new observation of the specified type.
This method creates a new observation but does not set it as the current span in the context. To create and use an observation within a context, use start_as_current_observation().
Arguments:
- trace_context: Optional context for connecting to an existing trace
- name: Name of the observation
- as_type: Type of observation to create (defaults to "span")
- input: Input data for the operation
- output: Output data from the operation
- metadata: Additional metadata to associate with the observation
- version: Version identifier for the code or component
- level: Importance level of the observation
- status_message: Optional status message for the observation
- completion_start_time: When the model started generating (for generation types)
- model: Name/identifier of the AI model used (for generation types)
- model_parameters: Parameters used for the model (for generation types)
- usage_details: Token usage information (for generation types)
- cost_details: Cost information (for generation types)
- prompt: Associated prompt template (for generation types)
Returns:
An observation object of the appropriate type that must be ended with .end()
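For illustration, a manually ended generation-type observation (the model name and payloads below are placeholders):

```python
generation = langfuse.start_observation(
    name="summarize",
    as_type="generation",
    model="gpt-4",  # placeholder model name
    input={"text": "..."},
)
try:
    # Call the model, then record its output
    generation.update(output="summary text")
finally:
    generation.end()
```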
747 def start_generation( 748 self, 749 *, 750 trace_context: Optional[TraceContext] = None, 751 name: str, 752 input: Optional[Any] = None, 753 output: Optional[Any] = None, 754 metadata: Optional[Any] = None, 755 version: Optional[str] = None, 756 level: Optional[SpanLevel] = None, 757 status_message: Optional[str] = None, 758 completion_start_time: Optional[datetime] = None, 759 model: Optional[str] = None, 760 model_parameters: Optional[Dict[str, MapValue]] = None, 761 usage_details: Optional[Dict[str, int]] = None, 762 cost_details: Optional[Dict[str, float]] = None, 763 prompt: Optional[PromptClient] = None, 764 ) -> LangfuseGeneration: 765 """Create a new generation span for model generations. 766 767 DEPRECATED: This method is deprecated and will be removed in a future version. 768 Use start_observation(as_type='generation') instead. 769 770 This method creates a specialized span for tracking model generations. 771 It includes additional fields specific to model generations such as model name, 772 token usage, and cost details. 773 774 The created generation span will be the child of the current span in the context. 775 776 Args: 777 trace_context: Optional context for connecting to an existing trace 778 name: Name of the generation operation 779 input: Input data for the model (e.g., prompts) 780 output: Output from the model (e.g., completions) 781 metadata: Additional metadata to associate with the generation 782 version: Version identifier for the model or component 783 level: Importance level of the generation (info, warning, error) 784 status_message: Optional status message for the generation 785 completion_start_time: When the model started generating the response 786 model: Name/identifier of the AI model used (e.g., "gpt-4") 787 model_parameters: Parameters used for the model (e.g., temperature, max_tokens) 788 usage_details: Token usage information (e.g., prompt_tokens, completion_tokens) 789 cost_details: Cost information for the model call 790 prompt: Associated prompt template from Langfuse prompt management 791 792 Returns: 793 A LangfuseGeneration object that must be ended with .end() when complete 794 795 Example: 796 ```python 797 generation = langfuse.start_generation( 798 name="answer-generation", 799 model="gpt-4", 800 input={"prompt": "Explain quantum computing"}, 801 model_parameters={"temperature": 0.7} 802 ) 803 try: 804 # Call model API 805 response = llm.generate(...) 806 807 generation.update( 808 output=response.text, 809 usage_details={ 810 "prompt_tokens": response.usage.prompt_tokens, 811 "completion_tokens": response.usage.completion_tokens 812 } 813 ) 814 finally: 815 generation.end() 816 ``` 817 """ 818 warnings.warn( 819 "start_generation is deprecated and will be removed in a future version. " 820 "Use start_observation(as_type='generation') instead.", 821 DeprecationWarning, 822 stacklevel=2, 823 ) 824 return self.start_observation( 825 trace_context=trace_context, 826 name=name, 827 as_type="generation", 828 input=input, 829 output=output, 830 metadata=metadata, 831 version=version, 832 level=level, 833 status_message=status_message, 834 completion_start_time=completion_start_time, 835 model=model, 836 model_parameters=model_parameters, 837 usage_details=usage_details, 838 cost_details=cost_details, 839 prompt=prompt, 840 )
Create a new generation span for model generations.
DEPRECATED: This method is deprecated and will be removed in a future version. Use start_observation(as_type='generation') instead.
This method creates a specialized span for tracking model generations. It includes additional fields specific to model generations such as model name, token usage, and cost details.
The created generation span will be the child of the current span in the context.
Arguments:
- trace_context: Optional context for connecting to an existing trace
- name: Name of the generation operation
- input: Input data for the model (e.g., prompts)
- output: Output from the model (e.g., completions)
- metadata: Additional metadata to associate with the generation
- version: Version identifier for the model or component
- level: Importance level of the generation (info, warning, error)
- status_message: Optional status message for the generation
- completion_start_time: When the model started generating the response
- model: Name/identifier of the AI model used (e.g., "gpt-4")
- model_parameters: Parameters used for the model (e.g., temperature, max_tokens)
- usage_details: Token usage information (e.g., prompt_tokens, completion_tokens)
- cost_details: Cost information for the model call
- prompt: Associated prompt template from Langfuse prompt management
Returns:
A LangfuseGeneration object that must be ended with .end() when complete
Example:
```python
generation = langfuse.start_generation(
    name="answer-generation",
    model="gpt-4",
    input={"prompt": "Explain quantum computing"},
    model_parameters={"temperature": 0.7}
)
try:
    # Call model API
    response = llm.generate(...)

    generation.update(
        output=response.text,
        usage_details={
            "prompt_tokens": response.usage.prompt_tokens,
            "completion_tokens": response.usage.completion_tokens
        }
    )
finally:
    generation.end()
```
842 def start_as_current_generation( 843 self, 844 *, 845 trace_context: Optional[TraceContext] = None, 846 name: str, 847 input: Optional[Any] = None, 848 output: Optional[Any] = None, 849 metadata: Optional[Any] = None, 850 version: Optional[str] = None, 851 level: Optional[SpanLevel] = None, 852 status_message: Optional[str] = None, 853 completion_start_time: Optional[datetime] = None, 854 model: Optional[str] = None, 855 model_parameters: Optional[Dict[str, MapValue]] = None, 856 usage_details: Optional[Dict[str, int]] = None, 857 cost_details: Optional[Dict[str, float]] = None, 858 prompt: Optional[PromptClient] = None, 859 end_on_exit: Optional[bool] = None, 860 ) -> _AgnosticContextManager[LangfuseGeneration]: 861 """Create a new generation span and set it as the current span in a context manager. 862 863 DEPRECATED: This method is deprecated and will be removed in a future version. 864 Use start_as_current_observation(as_type='generation') instead. 865 866 This method creates a specialized span for model generations and sets it as the 867 current span within a context manager. Use this method with a 'with' statement to 868 automatically handle the generation span lifecycle within a code block. 869 870 The created generation span will be the child of the current span in the context. 871 872 Args: 873 trace_context: Optional context for connecting to an existing trace 874 name: Name of the generation operation 875 input: Input data for the model (e.g., prompts) 876 output: Output from the model (e.g., completions) 877 metadata: Additional metadata to associate with the generation 878 version: Version identifier for the model or component 879 level: Importance level of the generation (info, warning, error) 880 status_message: Optional status message for the generation 881 completion_start_time: When the model started generating the response 882 model: Name/identifier of the AI model used (e.g., "gpt-4") 883 model_parameters: Parameters used for the model (e.g., temperature, max_tokens) 884 usage_details: Token usage information (e.g., prompt_tokens, completion_tokens) 885 cost_details: Cost information for the model call 886 prompt: Associated prompt template from Langfuse prompt management 887 end_on_exit (default: True): Whether to end the span automatically when leaving the context manager. If False, the span must be manually ended to avoid memory leaks. 888 889 Returns: 890 A context manager that yields a LangfuseGeneration 891 892 Example: 893 ```python 894 with langfuse.start_as_current_generation( 895 name="answer-generation", 896 model="gpt-4", 897 input={"prompt": "Explain quantum computing"} 898 ) as generation: 899 # Call model API 900 response = llm.generate(...) 901 902 # Update with results 903 generation.update( 904 output=response.text, 905 usage_details={ 906 "prompt_tokens": response.usage.prompt_tokens, 907 "completion_tokens": response.usage.completion_tokens 908 } 909 ) 910 ``` 911 """ 912 warnings.warn( 913 "start_as_current_generation is deprecated and will be removed in a future version. 
" 914 "Use start_as_current_observation(as_type='generation') instead.", 915 DeprecationWarning, 916 stacklevel=2, 917 ) 918 return self.start_as_current_observation( 919 trace_context=trace_context, 920 name=name, 921 as_type="generation", 922 input=input, 923 output=output, 924 metadata=metadata, 925 version=version, 926 level=level, 927 status_message=status_message, 928 completion_start_time=completion_start_time, 929 model=model, 930 model_parameters=model_parameters, 931 usage_details=usage_details, 932 cost_details=cost_details, 933 prompt=prompt, 934 end_on_exit=end_on_exit, 935 )
Create a new generation span and set it as the current span in a context manager.
DEPRECATED: This method is deprecated and will be removed in a future version. Use start_as_current_observation(as_type='generation') instead.
This method creates a specialized span for model generations and sets it as the current span within a context manager. Use this method with a 'with' statement to automatically handle the generation span lifecycle within a code block.
The created generation span will be the child of the current span in the context.
Arguments:
- trace_context: Optional context for connecting to an existing trace
- name: Name of the generation operation
- input: Input data for the model (e.g., prompts)
- output: Output from the model (e.g., completions)
- metadata: Additional metadata to associate with the generation
- version: Version identifier for the model or component
- level: Importance level of the generation (info, warning, error)
- status_message: Optional status message for the generation
- completion_start_time: When the model started generating the response
- model: Name/identifier of the AI model used (e.g., "gpt-4")
- model_parameters: Parameters used for the model (e.g., temperature, max_tokens)
- usage_details: Token usage information (e.g., prompt_tokens, completion_tokens)
- cost_details: Cost information for the model call
- prompt: Associated prompt template from Langfuse prompt management
- end_on_exit (default: True): Whether to end the span automatically when leaving the context manager. If False, the span must be manually ended to avoid memory leaks.
Returns:
A context manager that yields a LangfuseGeneration
Example:
```python
with langfuse.start_as_current_generation(
    name="answer-generation",
    model="gpt-4",
    input={"prompt": "Explain quantum computing"}
) as generation:
    # Call model API
    response = llm.generate(...)

    # Update with results
    generation.update(
        output=response.text,
        usage_details={
            "prompt_tokens": response.usage.prompt_tokens,
            "completion_tokens": response.usage.completion_tokens
        }
    )
```
1093 def start_as_current_observation( 1094 self, 1095 *, 1096 trace_context: Optional[TraceContext] = None, 1097 name: str, 1098 as_type: ObservationTypeLiteralNoEvent = "span", 1099 input: Optional[Any] = None, 1100 output: Optional[Any] = None, 1101 metadata: Optional[Any] = None, 1102 version: Optional[str] = None, 1103 level: Optional[SpanLevel] = None, 1104 status_message: Optional[str] = None, 1105 completion_start_time: Optional[datetime] = None, 1106 model: Optional[str] = None, 1107 model_parameters: Optional[Dict[str, MapValue]] = None, 1108 usage_details: Optional[Dict[str, int]] = None, 1109 cost_details: Optional[Dict[str, float]] = None, 1110 prompt: Optional[PromptClient] = None, 1111 end_on_exit: Optional[bool] = None, 1112 ) -> Union[ 1113 _AgnosticContextManager[LangfuseGeneration], 1114 _AgnosticContextManager[LangfuseSpan], 1115 _AgnosticContextManager[LangfuseAgent], 1116 _AgnosticContextManager[LangfuseTool], 1117 _AgnosticContextManager[LangfuseChain], 1118 _AgnosticContextManager[LangfuseRetriever], 1119 _AgnosticContextManager[LangfuseEvaluator], 1120 _AgnosticContextManager[LangfuseEmbedding], 1121 _AgnosticContextManager[LangfuseGuardrail], 1122 ]: 1123 """Create a new observation and set it as the current span in a context manager. 1124 1125 This method creates a new observation of the specified type and sets it as the 1126 current span within a context manager. Use this method with a 'with' statement to 1127 automatically handle the observation lifecycle within a code block. 1128 1129 The created observation will be the child of the current span in the context. 1130 1131 Args: 1132 trace_context: Optional context for connecting to an existing trace 1133 name: Name of the observation (e.g., function or operation name) 1134 as_type: Type of observation to create (defaults to "span") 1135 input: Input data for the operation (can be any JSON-serializable object) 1136 output: Output data from the operation (can be any JSON-serializable object) 1137 metadata: Additional metadata to associate with the observation 1138 version: Version identifier for the code or component 1139 level: Importance level of the observation (info, warning, error) 1140 status_message: Optional status message for the observation 1141 end_on_exit (default: True): Whether to end the span automatically when leaving the context manager. If False, the span must be manually ended to avoid memory leaks. 1142 1143 The following parameters are available when as_type is: "generation" or "embedding". 
1144 completion_start_time: When the model started generating the response 1145 model: Name/identifier of the AI model used (e.g., "gpt-4") 1146 model_parameters: Parameters used for the model (e.g., temperature, max_tokens) 1147 usage_details: Token usage information (e.g., prompt_tokens, completion_tokens) 1148 cost_details: Cost information for the model call 1149 prompt: Associated prompt template from Langfuse prompt management 1150 1151 Returns: 1152 A context manager that yields the appropriate observation type based on as_type 1153 1154 Example: 1155 ```python 1156 # Create a span 1157 with langfuse.start_as_current_observation(name="process-query", as_type="span") as span: 1158 # Do work 1159 result = process_data() 1160 span.update(output=result) 1161 1162 # Create a child span automatically 1163 with span.start_as_current_span(name="sub-operation") as child_span: 1164 # Do sub-operation work 1165 child_span.update(output="sub-result") 1166 1167 # Create a tool observation 1168 with langfuse.start_as_current_observation(name="web-search", as_type="tool") as tool: 1169 # Do tool work 1170 results = search_web(query) 1171 tool.update(output=results) 1172 1173 # Create a generation observation 1174 with langfuse.start_as_current_observation( 1175 name="answer-generation", 1176 as_type="generation", 1177 model="gpt-4" 1178 ) as generation: 1179 # Generate answer 1180 response = llm.generate(...) 1181 generation.update(output=response) 1182 ``` 1183 """ 1184 if as_type in get_observation_types_list(ObservationTypeGenerationLike): 1185 if trace_context: 1186 trace_id = trace_context.get("trace_id", None) 1187 parent_span_id = trace_context.get("parent_span_id", None) 1188 1189 if trace_id: 1190 remote_parent_span = self._create_remote_parent_span( 1191 trace_id=trace_id, parent_span_id=parent_span_id 1192 ) 1193 1194 return cast( 1195 Union[ 1196 _AgnosticContextManager[LangfuseGeneration], 1197 _AgnosticContextManager[LangfuseEmbedding], 1198 ], 1199 self._create_span_with_parent_context( 1200 as_type=as_type, 1201 name=name, 1202 remote_parent_span=remote_parent_span, 1203 parent=None, 1204 end_on_exit=end_on_exit, 1205 input=input, 1206 output=output, 1207 metadata=metadata, 1208 version=version, 1209 level=level, 1210 status_message=status_message, 1211 completion_start_time=completion_start_time, 1212 model=model, 1213 model_parameters=model_parameters, 1214 usage_details=usage_details, 1215 cost_details=cost_details, 1216 prompt=prompt, 1217 ), 1218 ) 1219 1220 return cast( 1221 Union[ 1222 _AgnosticContextManager[LangfuseGeneration], 1223 _AgnosticContextManager[LangfuseEmbedding], 1224 ], 1225 self._start_as_current_otel_span_with_processed_media( 1226 as_type=as_type, 1227 name=name, 1228 end_on_exit=end_on_exit, 1229 input=input, 1230 output=output, 1231 metadata=metadata, 1232 version=version, 1233 level=level, 1234 status_message=status_message, 1235 completion_start_time=completion_start_time, 1236 model=model, 1237 model_parameters=model_parameters, 1238 usage_details=usage_details, 1239 cost_details=cost_details, 1240 prompt=prompt, 1241 ), 1242 ) 1243 1244 if as_type in get_observation_types_list(ObservationTypeSpanLike): 1245 if trace_context: 1246 trace_id = trace_context.get("trace_id", None) 1247 parent_span_id = trace_context.get("parent_span_id", None) 1248 1249 if trace_id: 1250 remote_parent_span = self._create_remote_parent_span( 1251 trace_id=trace_id, parent_span_id=parent_span_id 1252 ) 1253 1254 return cast( 1255 Union[ 1256 
_AgnosticContextManager[LangfuseSpan], 1257 _AgnosticContextManager[LangfuseAgent], 1258 _AgnosticContextManager[LangfuseTool], 1259 _AgnosticContextManager[LangfuseChain], 1260 _AgnosticContextManager[LangfuseRetriever], 1261 _AgnosticContextManager[LangfuseEvaluator], 1262 _AgnosticContextManager[LangfuseGuardrail], 1263 ], 1264 self._create_span_with_parent_context( 1265 as_type=as_type, 1266 name=name, 1267 remote_parent_span=remote_parent_span, 1268 parent=None, 1269 end_on_exit=end_on_exit, 1270 input=input, 1271 output=output, 1272 metadata=metadata, 1273 version=version, 1274 level=level, 1275 status_message=status_message, 1276 ), 1277 ) 1278 1279 return cast( 1280 Union[ 1281 _AgnosticContextManager[LangfuseSpan], 1282 _AgnosticContextManager[LangfuseAgent], 1283 _AgnosticContextManager[LangfuseTool], 1284 _AgnosticContextManager[LangfuseChain], 1285 _AgnosticContextManager[LangfuseRetriever], 1286 _AgnosticContextManager[LangfuseEvaluator], 1287 _AgnosticContextManager[LangfuseGuardrail], 1288 ], 1289 self._start_as_current_otel_span_with_processed_media( 1290 as_type=as_type, 1291 name=name, 1292 end_on_exit=end_on_exit, 1293 input=input, 1294 output=output, 1295 metadata=metadata, 1296 version=version, 1297 level=level, 1298 status_message=status_message, 1299 ), 1300 ) 1301 1302 # This should never be reached since all valid types are handled above 1303 langfuse_logger.warning( 1304 f"Unknown observation type: {as_type}, falling back to span" 1305 ) 1306 return self._start_as_current_otel_span_with_processed_media( 1307 as_type="span", 1308 name=name, 1309 end_on_exit=end_on_exit, 1310 input=input, 1311 output=output, 1312 metadata=metadata, 1313 version=version, 1314 level=level, 1315 status_message=status_message, 1316 )
Create a new observation and set it as the current span in a context manager.
This method creates a new observation of the specified type and sets it as the current span within a context manager. Use this method with a 'with' statement to automatically handle the observation lifecycle within a code block.
The created observation will be the child of the current span in the context.
Arguments:
- trace_context: Optional context for connecting to an existing trace
- name: Name of the observation (e.g., function or operation name)
- as_type: Type of observation to create (defaults to "span")
- input: Input data for the operation (can be any JSON-serializable object)
- output: Output data from the operation (can be any JSON-serializable object)
- metadata: Additional metadata to associate with the observation
- version: Version identifier for the code or component
- level: Importance level of the observation (info, warning, error)
- status_message: Optional status message for the observation
- end_on_exit (default: True): Whether to end the span automatically when leaving the context manager. If False, the span must be manually ended to avoid memory leaks.
The following parameters apply only when as_type is "generation" or "embedding":
- completion_start_time: When the model started generating the response
- model: Name/identifier of the AI model used (e.g., "gpt-4")
- model_parameters: Parameters used for the model (e.g., temperature, max_tokens)
- usage_details: Token usage information (e.g., prompt_tokens, completion_tokens)
- cost_details: Cost information for the model call
- prompt: Associated prompt template from Langfuse prompt management
Returns:
A context manager that yields the appropriate observation type based on as_type
Example:
```python
# Create a span
with langfuse.start_as_current_observation(name="process-query", as_type="span") as span:
    # Do work
    result = process_data()
    span.update(output=result)

    # Create a child span automatically
    with span.start_as_current_span(name="sub-operation") as child_span:
        # Do sub-operation work
        child_span.update(output="sub-result")

# Create a tool observation
with langfuse.start_as_current_observation(name="web-search", as_type="tool") as tool:
    # Do tool work
    results = search_web(query)
    tool.update(output=results)

# Create a generation observation
with langfuse.start_as_current_observation(
    name="answer-generation",
    as_type="generation",
    model="gpt-4"
) as generation:
    # Generate answer
    response = llm.generate(...)
    generation.update(output=response)
```
1477 def update_current_generation( 1478 self, 1479 *, 1480 name: Optional[str] = None, 1481 input: Optional[Any] = None, 1482 output: Optional[Any] = None, 1483 metadata: Optional[Any] = None, 1484 version: Optional[str] = None, 1485 level: Optional[SpanLevel] = None, 1486 status_message: Optional[str] = None, 1487 completion_start_time: Optional[datetime] = None, 1488 model: Optional[str] = None, 1489 model_parameters: Optional[Dict[str, MapValue]] = None, 1490 usage_details: Optional[Dict[str, int]] = None, 1491 cost_details: Optional[Dict[str, float]] = None, 1492 prompt: Optional[PromptClient] = None, 1493 ) -> None: 1494 """Update the current active generation span with new information. 1495 1496 This method updates the current generation span in the active context with 1497 additional information. It's useful for adding output, usage stats, or other 1498 details that become available during or after model generation. 1499 1500 Args: 1501 name: The generation name 1502 input: Updated input data for the model 1503 output: Output from the model (e.g., completions) 1504 metadata: Additional metadata to associate with the generation 1505 version: Version identifier for the model or component 1506 level: Importance level of the generation (info, warning, error) 1507 status_message: Optional status message for the generation 1508 completion_start_time: When the model started generating the response 1509 model: Name/identifier of the AI model used (e.g., "gpt-4") 1510 model_parameters: Parameters used for the model (e.g., temperature, max_tokens) 1511 usage_details: Token usage information (e.g., prompt_tokens, completion_tokens) 1512 cost_details: Cost information for the model call 1513 prompt: Associated prompt template from Langfuse prompt management 1514 1515 Example: 1516 ```python 1517 with langfuse.start_as_current_generation(name="answer-query") as generation: 1518 # Initial setup and API call 1519 response = llm.generate(...) 1520 1521 # Update with results that weren't available at creation time 1522 langfuse.update_current_generation( 1523 output=response.text, 1524 usage_details={ 1525 "prompt_tokens": response.usage.prompt_tokens, 1526 "completion_tokens": response.usage.completion_tokens 1527 } 1528 ) 1529 ``` 1530 """ 1531 if not self._tracing_enabled: 1532 langfuse_logger.debug( 1533 "Operation skipped: update_current_generation - Tracing is disabled or client is in no-op mode." 1534 ) 1535 return 1536 1537 current_otel_span = self._get_current_otel_span() 1538 1539 if current_otel_span is not None: 1540 generation = LangfuseGeneration( 1541 otel_span=current_otel_span, langfuse_client=self 1542 ) 1543 1544 if name: 1545 current_otel_span.update_name(name) 1546 1547 generation.update( 1548 input=input, 1549 output=output, 1550 metadata=metadata, 1551 version=version, 1552 level=level, 1553 status_message=status_message, 1554 completion_start_time=completion_start_time, 1555 model=model, 1556 model_parameters=model_parameters, 1557 usage_details=usage_details, 1558 cost_details=cost_details, 1559 prompt=prompt, 1560 )
Update the current active generation span with new information.
This method updates the current generation span in the active context with additional information. It's useful for adding output, usage stats, or other details that become available during or after model generation.
Arguments:
- name: The generation name
- input: Updated input data for the model
- output: Output from the model (e.g., completions)
- metadata: Additional metadata to associate with the generation
- version: Version identifier for the model or component
- level: Importance level of the generation (info, warning, error)
- status_message: Optional status message for the generation
- completion_start_time: When the model started generating the response
- model: Name/identifier of the AI model used (e.g., "gpt-4")
- model_parameters: Parameters used for the model (e.g., temperature, max_tokens)
- usage_details: Token usage information (e.g., prompt_tokens, completion_tokens)
- cost_details: Cost information for the model call
- prompt: Associated prompt template from Langfuse prompt management
Example:
```python
with langfuse.start_as_current_generation(name="answer-query") as generation:
    # Initial setup and API call
    response = llm.generate(...)

    # Update with results that weren't available at creation time
    langfuse.update_current_generation(
        output=response.text,
        usage_details={
            "prompt_tokens": response.usage.prompt_tokens,
            "completion_tokens": response.usage.completion_tokens
        }
    )
```
1562 def update_current_span( 1563 self, 1564 *, 1565 name: Optional[str] = None, 1566 input: Optional[Any] = None, 1567 output: Optional[Any] = None, 1568 metadata: Optional[Any] = None, 1569 version: Optional[str] = None, 1570 level: Optional[SpanLevel] = None, 1571 status_message: Optional[str] = None, 1572 ) -> None: 1573 """Update the current active span with new information. 1574 1575 This method updates the current span in the active context with 1576 additional information. It's useful for adding outputs or metadata 1577 that become available during execution. 1578 1579 Args: 1580 name: The span name 1581 input: Updated input data for the operation 1582 output: Output data from the operation 1583 metadata: Additional metadata to associate with the span 1584 version: Version identifier for the code or component 1585 level: Importance level of the span (info, warning, error) 1586 status_message: Optional status message for the span 1587 1588 Example: 1589 ```python 1590 with langfuse.start_as_current_span(name="process-data") as span: 1591 # Initial processing 1592 result = process_first_part() 1593 1594 # Update with intermediate results 1595 langfuse.update_current_span(metadata={"intermediate_result": result}) 1596 1597 # Continue processing 1598 final_result = process_second_part(result) 1599 1600 # Final update 1601 langfuse.update_current_span(output=final_result) 1602 ``` 1603 """ 1604 if not self._tracing_enabled: 1605 langfuse_logger.debug( 1606 "Operation skipped: update_current_span - Tracing is disabled or client is in no-op mode." 1607 ) 1608 return 1609 1610 current_otel_span = self._get_current_otel_span() 1611 1612 if current_otel_span is not None: 1613 span = LangfuseSpan( 1614 otel_span=current_otel_span, 1615 langfuse_client=self, 1616 environment=self._environment, 1617 ) 1618 1619 if name: 1620 current_otel_span.update_name(name) 1621 1622 span.update( 1623 input=input, 1624 output=output, 1625 metadata=metadata, 1626 version=version, 1627 level=level, 1628 status_message=status_message, 1629 )
Update the current active span with new information.
This method updates the current span in the active context with additional information. It's useful for adding outputs or metadata that become available during execution.
Arguments:
- name: The span name
- input: Updated input data for the operation
- output: Output data from the operation
- metadata: Additional metadata to associate with the span
- version: Version identifier for the code or component
- level: Importance level of the span (info, warning, error)
- status_message: Optional status message for the span
Example:
```python
with langfuse.start_as_current_span(name="process-data") as span:
    # Initial processing
    result = process_first_part()

    # Update with intermediate results
    langfuse.update_current_span(metadata={"intermediate_result": result})

    # Continue processing
    final_result = process_second_part(result)

    # Final update
    langfuse.update_current_span(output=final_result)
```
1631 def update_current_trace( 1632 self, 1633 *, 1634 name: Optional[str] = None, 1635 user_id: Optional[str] = None, 1636 session_id: Optional[str] = None, 1637 version: Optional[str] = None, 1638 input: Optional[Any] = None, 1639 output: Optional[Any] = None, 1640 metadata: Optional[Any] = None, 1641 tags: Optional[List[str]] = None, 1642 public: Optional[bool] = None, 1643 ) -> None: 1644 """Update the current trace with additional information. 1645 1646 Args: 1647 name: Updated name for the Langfuse trace 1648 user_id: ID of the user who initiated the Langfuse trace 1649 session_id: Session identifier for grouping related Langfuse traces 1650 version: Version identifier for the application or service 1651 input: Input data for the overall Langfuse trace 1652 output: Output data from the overall Langfuse trace 1653 metadata: Additional metadata to associate with the Langfuse trace 1654 tags: List of tags to categorize the Langfuse trace 1655 public: Whether the Langfuse trace should be publicly accessible 1656 1657 See Also: 1658 :func:`langfuse.propagate_attributes`: Recommended replacement 1659 """ 1660 if not self._tracing_enabled: 1661 langfuse_logger.debug( 1662 "Operation skipped: update_current_trace - Tracing is disabled or client is in no-op mode." 1663 ) 1664 return 1665 1666 current_otel_span = self._get_current_otel_span() 1667 1668 if current_otel_span is not None: 1669 existing_observation_type = current_otel_span.attributes.get( # type: ignore[attr-defined] 1670 LangfuseOtelSpanAttributes.OBSERVATION_TYPE, "span" 1671 ) 1672 # We need to preserve the class to keep the correct observation type 1673 span_class = self._get_span_class(existing_observation_type) 1674 span = span_class( 1675 otel_span=current_otel_span, 1676 langfuse_client=self, 1677 environment=self._environment, 1678 ) 1679 1680 span.update_trace( 1681 name=name, 1682 user_id=user_id, 1683 session_id=session_id, 1684 version=version, 1685 input=input, 1686 output=output, 1687 metadata=metadata, 1688 tags=tags, 1689 public=public, 1690 )
Update the current trace with additional information.
Arguments:
- name: Updated name for the Langfuse trace
- user_id: ID of the user who initiated the Langfuse trace
- session_id: Session identifier for grouping related Langfuse traces
- version: Version identifier for the application or service
- input: Input data for the overall Langfuse trace
- output: Output data from the overall Langfuse trace
- metadata: Additional metadata to associate with the Langfuse trace
- tags: List of tags to categorize the Langfuse trace
- public: Whether the Langfuse trace should be publicly accessible
See Also:
langfuse.propagate_attributes(): Recommended replacement
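For illustration, trace-level attributes can be set from within any active span (the identifiers below are placeholders):

```python
with langfuse.start_as_current_span(name="handle-request") as span:
    # Attach trace-level attributes that become known during execution
    langfuse.update_current_trace(
        user_id="user-123",        # placeholder user identifier
        session_id="session-456",  # placeholder session identifier
        tags=["production", "chat"],
    )
```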
1692 def create_event( 1693 self, 1694 *, 1695 trace_context: Optional[TraceContext] = None, 1696 name: str, 1697 input: Optional[Any] = None, 1698 output: Optional[Any] = None, 1699 metadata: Optional[Any] = None, 1700 version: Optional[str] = None, 1701 level: Optional[SpanLevel] = None, 1702 status_message: Optional[str] = None, 1703 ) -> LangfuseEvent: 1704 """Create a new Langfuse observation of type 'EVENT'. 1705 1706 The created Langfuse Event observation will be the child of the current span in the context. 1707 1708 Args: 1709 trace_context: Optional context for connecting to an existing trace 1710 name: Name of the span (e.g., function or operation name) 1711 input: Input data for the operation (can be any JSON-serializable object) 1712 output: Output data from the operation (can be any JSON-serializable object) 1713 metadata: Additional metadata to associate with the span 1714 version: Version identifier for the code or component 1715 level: Importance level of the span (info, warning, error) 1716 status_message: Optional status message for the span 1717 1718 Returns: 1719 The Langfuse Event object 1720 1721 Example: 1722 ```python 1723 event = langfuse.create_event(name="process-event") 1724 ``` 1725 """ 1726 timestamp = time_ns() 1727 1728 if trace_context: 1729 trace_id = trace_context.get("trace_id", None) 1730 parent_span_id = trace_context.get("parent_span_id", None) 1731 1732 if trace_id: 1733 remote_parent_span = self._create_remote_parent_span( 1734 trace_id=trace_id, parent_span_id=parent_span_id 1735 ) 1736 1737 with otel_trace_api.use_span( 1738 cast(otel_trace_api.Span, remote_parent_span) 1739 ): 1740 otel_span = self._otel_tracer.start_span( 1741 name=name, start_time=timestamp 1742 ) 1743 otel_span.set_attribute(LangfuseOtelSpanAttributes.AS_ROOT, True) 1744 1745 return cast( 1746 LangfuseEvent, 1747 LangfuseEvent( 1748 otel_span=otel_span, 1749 langfuse_client=self, 1750 environment=self._environment, 1751 input=input, 1752 output=output, 1753 metadata=metadata, 1754 version=version, 1755 level=level, 1756 status_message=status_message, 1757 ).end(end_time=timestamp), 1758 ) 1759 1760 otel_span = self._otel_tracer.start_span(name=name, start_time=timestamp) 1761 1762 return cast( 1763 LangfuseEvent, 1764 LangfuseEvent( 1765 otel_span=otel_span, 1766 langfuse_client=self, 1767 environment=self._environment, 1768 input=input, 1769 output=output, 1770 metadata=metadata, 1771 version=version, 1772 level=level, 1773 status_message=status_message, 1774 ).end(end_time=timestamp), 1775 )
Create a new Langfuse observation of type 'EVENT'.
The created Langfuse Event observation will be the child of the current span in the context.
Arguments:
- trace_context: Optional context for connecting to an existing trace
- name: Name of the span (e.g., function or operation name)
- input: Input data for the operation (can be any JSON-serializable object)
- output: Output data from the operation (can be any JSON-serializable object)
- metadata: Additional metadata to associate with the span
- version: Version identifier for the code or component
- level: Importance level of the span (info, warning, error)
- status_message: Optional status message for the span
Returns:
The Langfuse Event object
Example:
```python
event = langfuse.create_event(name="process-event")
```
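Events can also carry a payload and a severity level; a sketch with placeholder values, assuming SpanLevel accepts the "WARNING" literal:

```python
event = langfuse.create_event(
    name="cache-miss",
    input={"key": "user:42"},  # placeholder payload
    level="WARNING",           # SpanLevel literal (assumed)
    status_message="Falling back to database",
)
```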
1864 @staticmethod 1865 def create_trace_id(*, seed: Optional[str] = None) -> str: 1866 """Create a unique trace ID for use with Langfuse. 1867 1868 This method generates a unique trace ID for use with various Langfuse APIs. 1869 It can either generate a random ID or create a deterministic ID based on 1870 a seed string. 1871 1872 Trace IDs must be 32 lowercase hexadecimal characters, representing 16 bytes. 1873 This method ensures the generated ID meets this requirement. If you need to 1874 correlate an external ID with a Langfuse trace ID, use the external ID as the 1875 seed to get a valid, deterministic Langfuse trace ID. 1876 1877 Args: 1878 seed: Optional string to use as a seed for deterministic ID generation. 1879 If provided, the same seed will always produce the same ID. 1880 If not provided, a random ID will be generated. 1881 1882 Returns: 1883 A 32-character lowercase hexadecimal string representing the Langfuse trace ID. 1884 1885 Example: 1886 ```python 1887 # Generate a random trace ID 1888 trace_id = langfuse.create_trace_id() 1889 1890 # Generate a deterministic ID based on a seed 1891 session_trace_id = langfuse.create_trace_id(seed="session-456") 1892 1893 # Correlate an external ID with a Langfuse trace ID 1894 external_id = "external-system-123456" 1895 correlated_trace_id = langfuse.create_trace_id(seed=external_id) 1896 1897 # Use the ID with trace context 1898 with langfuse.start_as_current_span( 1899 name="process-request", 1900 trace_context={"trace_id": trace_id} 1901 ) as span: 1902 # Operation will be part of the specific trace 1903 pass 1904 ``` 1905 """ 1906 if not seed: 1907 trace_id_int = RandomIdGenerator().generate_trace_id() 1908 1909 return Langfuse._format_otel_trace_id(trace_id_int) 1910 1911 return sha256(seed.encode("utf-8")).digest()[:16].hex()
Create a unique trace ID for use with Langfuse.
This method generates a unique trace ID for use with various Langfuse APIs. It can either generate a random ID or create a deterministic ID based on a seed string.
Trace IDs must be 32 lowercase hexadecimal characters, representing 16 bytes. This method ensures the generated ID meets this requirement. If you need to correlate an external ID with a Langfuse trace ID, use the external ID as the seed to get a valid, deterministic Langfuse trace ID.
Arguments:
- seed: Optional string to use as a seed for deterministic ID generation. If provided, the same seed will always produce the same ID. If not provided, a random ID will be generated.
Returns:
A 32-character lowercase hexadecimal string representing the Langfuse trace ID.
Example:
```python
# Generate a random trace ID
trace_id = langfuse.create_trace_id()

# Generate a deterministic ID based on a seed
session_trace_id = langfuse.create_trace_id(seed="session-456")

# Correlate an external ID with a Langfuse trace ID
external_id = "external-system-123456"
correlated_trace_id = langfuse.create_trace_id(seed=external_id)

# Use the ID with trace context
with langfuse.start_as_current_span(
    name="process-request",
    trace_context={"trace_id": trace_id}
) as span:
    # Operation will be part of the specific trace
    pass
```
1987 def create_score( 1988 self, 1989 *, 1990 name: str, 1991 value: Union[float, str], 1992 session_id: Optional[str] = None, 1993 dataset_run_id: Optional[str] = None, 1994 trace_id: Optional[str] = None, 1995 observation_id: Optional[str] = None, 1996 score_id: Optional[str] = None, 1997 data_type: Optional[ScoreDataType] = None, 1998 comment: Optional[str] = None, 1999 config_id: Optional[str] = None, 2000 metadata: Optional[Any] = None, 2001 ) -> None: 2002 """Create a score for a specific trace or observation. 2003 2004 This method creates a score for evaluating a Langfuse trace or observation. Scores can be 2005 used to track quality metrics, user feedback, or automated evaluations. 2006 2007 Args: 2008 name: Name of the score (e.g., "relevance", "accuracy") 2009 value: Score value (can be numeric for NUMERIC/BOOLEAN types or string for CATEGORICAL) 2010 session_id: ID of the Langfuse session to associate the score with 2011 dataset_run_id: ID of the Langfuse dataset run to associate the score with 2012 trace_id: ID of the Langfuse trace to associate the score with 2013 observation_id: Optional ID of the specific observation to score. Trace ID must be provided too. 2014 score_id: Optional custom ID for the score (auto-generated if not provided) 2015 data_type: Type of score (NUMERIC, BOOLEAN, or CATEGORICAL) 2016 comment: Optional comment or explanation for the score 2017 config_id: Optional ID of a score config defined in Langfuse 2018 metadata: Optional metadata to be attached to the score 2019 2020 Example: 2021 ```python 2022 # Create a numeric score for accuracy 2023 langfuse.create_score( 2024 name="accuracy", 2025 value=0.92, 2026 trace_id="abcdef1234567890abcdef1234567890", 2027 data_type="NUMERIC", 2028 comment="High accuracy with minor irrelevant details" 2029 ) 2030 2031 # Create a categorical score for sentiment 2032 langfuse.create_score( 2033 name="sentiment", 2034 value="positive", 2035 trace_id="abcdef1234567890abcdef1234567890", 2036 observation_id="abcdef1234567890", 2037 data_type="CATEGORICAL" 2038 ) 2039 ``` 2040 """ 2041 if not self._tracing_enabled: 2042 return 2043 2044 score_id = score_id or self._create_observation_id() 2045 2046 try: 2047 new_body = ScoreBody( 2048 id=score_id, 2049 sessionId=session_id, 2050 datasetRunId=dataset_run_id, 2051 traceId=trace_id, 2052 observationId=observation_id, 2053 name=name, 2054 value=value, 2055 dataType=data_type, # type: ignore 2056 comment=comment, 2057 configId=config_id, 2058 environment=self._environment, 2059 metadata=metadata, 2060 ) 2061 2062 event = { 2063 "id": self.create_trace_id(), 2064 "type": "score-create", 2065 "timestamp": _get_timestamp(), 2066 "body": new_body, 2067 } 2068 2069 if self._resources is not None: 2070 # Force the score to be in sample if it was for a legacy trace ID, i.e. non-32 hexchar 2071 force_sample = ( 2072 not self._is_valid_trace_id(trace_id) if trace_id else True 2073 ) 2074 2075 self._resources.add_score_task( 2076 event, 2077 force_sample=force_sample, 2078 ) 2079 2080 except Exception as e: 2081 langfuse_logger.exception( 2082 f"Error creating score: Failed to process score event for trace_id={trace_id}, name={name}. Error: {e}" 2083 )
Create a score for a specific trace or observation.
This method creates a score for evaluating a Langfuse trace or observation. Scores can be used to track quality metrics, user feedback, or automated evaluations.
Arguments:
- name: Name of the score (e.g., "relevance", "accuracy")
- value: Score value (can be numeric for NUMERIC/BOOLEAN types or string for CATEGORICAL)
- session_id: ID of the Langfuse session to associate the score with
- dataset_run_id: ID of the Langfuse dataset run to associate the score with
- trace_id: ID of the Langfuse trace to associate the score with
- observation_id: Optional ID of the specific observation to score. Trace ID must be provided too.
- score_id: Optional custom ID for the score (auto-generated if not provided)
- data_type: Type of score (NUMERIC, BOOLEAN, or CATEGORICAL)
- comment: Optional comment or explanation for the score
- config_id: Optional ID of a score config defined in Langfuse
- metadata: Optional metadata to be attached to the score
Example:
```python
# Create a numeric score for accuracy
langfuse.create_score(
    name="accuracy",
    value=0.92,
    trace_id="abcdef1234567890abcdef1234567890",
    data_type="NUMERIC",
    comment="High accuracy with minor irrelevant details"
)

# Create a categorical score for sentiment
langfuse.create_score(
    name="sentiment",
    value="positive",
    trace_id="abcdef1234567890abcdef1234567890",
    observation_id="abcdef1234567890",
    data_type="CATEGORICAL"
)
```
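Scores can also target a session or a dataset run instead of a trace; a minimal sketch, assuming a session with the hypothetical ID "session-123" exists:
```python
# Attach a boolean user-feedback score to a whole session
# (session ID is a hypothetical example).
langfuse.create_score(
    name="user_feedback_positive",
    value=1,
    session_id="session-123",
    data_type="BOOLEAN",
)
```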
2109 def score_current_span( 2110 self, 2111 *, 2112 name: str, 2113 value: Union[float, str], 2114 score_id: Optional[str] = None, 2115 data_type: Optional[ScoreDataType] = None, 2116 comment: Optional[str] = None, 2117 config_id: Optional[str] = None, 2118 ) -> None: 2119 """Create a score for the current active span. 2120 2121 This method scores the currently active span in the context. It's a convenient 2122 way to score the current operation without needing to know its trace and span IDs. 2123 2124 Args: 2125 name: Name of the score (e.g., "relevance", "accuracy") 2126 value: Score value (can be numeric for NUMERIC/BOOLEAN types or string for CATEGORICAL) 2127 score_id: Optional custom ID for the score (auto-generated if not provided) 2128 data_type: Type of score (NUMERIC, BOOLEAN, or CATEGORICAL) 2129 comment: Optional comment or explanation for the score 2130 config_id: Optional ID of a score config defined in Langfuse 2131 2132 Example: 2133 ```python 2134 with langfuse.start_as_current_generation(name="answer-query") as generation: 2135 # Generate answer 2136 response = generate_answer(...) 2137 generation.update(output=response) 2138 2139 # Score the generation 2140 langfuse.score_current_span( 2141 name="relevance", 2142 value=0.85, 2143 data_type="NUMERIC", 2144 comment="Mostly relevant but contains some tangential information" 2145 ) 2146 ``` 2147 """ 2148 current_span = self._get_current_otel_span() 2149 2150 if current_span is not None: 2151 trace_id = self._get_otel_trace_id(current_span) 2152 observation_id = self._get_otel_span_id(current_span) 2153 2154 langfuse_logger.info( 2155 f"Score: Creating score name='{name}' value={value} for current span ({observation_id}) in trace {trace_id}" 2156 ) 2157 2158 self.create_score( 2159 trace_id=trace_id, 2160 observation_id=observation_id, 2161 name=name, 2162 value=cast(str, value), 2163 score_id=score_id, 2164 data_type=cast(Literal["CATEGORICAL"], data_type), 2165 comment=comment, 2166 config_id=config_id, 2167 )
Create a score for the current active span.
This method scores the currently active span in the context. It's a convenient way to score the current operation without needing to know its trace and span IDs.
Arguments:
- name: Name of the score (e.g., "relevance", "accuracy")
- value: Score value (can be numeric for NUMERIC/BOOLEAN types or string for CATEGORICAL)
- score_id: Optional custom ID for the score (auto-generated if not provided)
- data_type: Type of score (NUMERIC, BOOLEAN, or CATEGORICAL)
- comment: Optional comment or explanation for the score
- config_id: Optional ID of a score config defined in Langfuse
Example:
```python
with langfuse.start_as_current_generation(name="answer-query") as generation:
    # Generate answer
    response = generate_answer(...)
    generation.update(output=response)

    # Score the generation
    langfuse.score_current_span(
        name="relevance",
        value=0.85,
        data_type="NUMERIC",
        comment="Mostly relevant but contains some tangential information"
    )
```
2193 def score_current_trace( 2194 self, 2195 *, 2196 name: str, 2197 value: Union[float, str], 2198 score_id: Optional[str] = None, 2199 data_type: Optional[ScoreDataType] = None, 2200 comment: Optional[str] = None, 2201 config_id: Optional[str] = None, 2202 ) -> None: 2203 """Create a score for the current trace. 2204 2205 This method scores the trace of the currently active span. Unlike score_current_span, 2206 this method associates the score with the entire trace rather than a specific span. 2207 It's useful for scoring overall performance or quality of the entire operation. 2208 2209 Args: 2210 name: Name of the score (e.g., "user_satisfaction", "overall_quality") 2211 value: Score value (can be numeric for NUMERIC/BOOLEAN types or string for CATEGORICAL) 2212 score_id: Optional custom ID for the score (auto-generated if not provided) 2213 data_type: Type of score (NUMERIC, BOOLEAN, or CATEGORICAL) 2214 comment: Optional comment or explanation for the score 2215 config_id: Optional ID of a score config defined in Langfuse 2216 2217 Example: 2218 ```python 2219 with langfuse.start_as_current_span(name="process-user-request") as span: 2220 # Process request 2221 result = process_complete_request() 2222 span.update(output=result) 2223 2224 # Score the overall trace 2225 langfuse.score_current_trace( 2226 name="overall_quality", 2227 value=0.95, 2228 data_type="NUMERIC", 2229 comment="High quality end-to-end response" 2230 ) 2231 ``` 2232 """ 2233 current_span = self._get_current_otel_span() 2234 2235 if current_span is not None: 2236 trace_id = self._get_otel_trace_id(current_span) 2237 2238 langfuse_logger.info( 2239 f"Score: Creating score name='{name}' value={value} for entire trace {trace_id}" 2240 ) 2241 2242 self.create_score( 2243 trace_id=trace_id, 2244 name=name, 2245 value=cast(str, value), 2246 score_id=score_id, 2247 data_type=cast(Literal["CATEGORICAL"], data_type), 2248 comment=comment, 2249 config_id=config_id, 2250 )
Create a score for the current trace.
This method scores the trace of the currently active span. Unlike score_current_span, this method associates the score with the entire trace rather than a specific span. It's useful for scoring overall performance or quality of the entire operation.
Arguments:
- name: Name of the score (e.g., "user_satisfaction", "overall_quality")
- value: Score value (can be numeric for NUMERIC/BOOLEAN types or string for CATEGORICAL)
- score_id: Optional custom ID for the score (auto-generated if not provided)
- data_type: Type of score (NUMERIC, BOOLEAN, or CATEGORICAL)
- comment: Optional comment or explanation for the score
- config_id: Optional ID of a score config defined in Langfuse
Example:
```python
with langfuse.start_as_current_span(name="process-user-request") as span:
    # Process request
    result = process_complete_request()
    span.update(output=result)

    # Score the overall trace
    langfuse.score_current_trace(
        name="overall_quality",
        value=0.95,
        data_type="NUMERIC",
        comment="High quality end-to-end response"
    )
```
2252 def flush(self) -> None: 2253 """Force flush all pending spans and events to the Langfuse API. 2254 2255 This method manually flushes any pending spans, scores, and other events to the 2256 Langfuse API. It's useful in scenarios where you want to ensure all data is sent 2257 before proceeding, without waiting for the automatic flush interval. 2258 2259 Example: 2260 ```python 2261 # Record some spans and scores 2262 with langfuse.start_as_current_span(name="operation") as span: 2263 # Do work... 2264 pass 2265 2266 # Ensure all data is sent to Langfuse before proceeding 2267 langfuse.flush() 2268 2269 # Continue with other work 2270 ``` 2271 """ 2272 if self._resources is not None: 2273 self._resources.flush()
Force flush all pending spans and events to the Langfuse API.
This method manually flushes any pending spans, scores, and other events to the Langfuse API. It's useful in scenarios where you want to ensure all data is sent before proceeding, without waiting for the automatic flush interval.
Example:
```python
# Record some spans and scores
with langfuse.start_as_current_span(name="operation") as span:
    # Do work...
    pass

# Ensure all data is sent to Langfuse before proceeding
langfuse.flush()

# Continue with other work
```
2275 def shutdown(self) -> None: 2276 """Shut down the Langfuse client and flush all pending data. 2277 2278 This method cleanly shuts down the Langfuse client, ensuring all pending data 2279 is flushed to the API and all background threads are properly terminated. 2280 2281 It's important to call this method when your application is shutting down to 2282 prevent data loss and resource leaks. For most applications, using the client 2283 as a context manager or relying on the automatic shutdown via atexit is sufficient. 2284 2285 Example: 2286 ```python 2287 # Initialize Langfuse 2288 langfuse = Langfuse(public_key="...", secret_key="...") 2289 2290 # Use Langfuse throughout your application 2291 # ... 2292 2293 # When application is shutting down 2294 langfuse.shutdown() 2295 ``` 2296 """ 2297 if self._resources is not None: 2298 self._resources.shutdown()
Shut down the Langfuse client and flush all pending data.
This method cleanly shuts down the Langfuse client, ensuring all pending data is flushed to the API and all background threads are properly terminated.
It's important to call this method when your application is shutting down to prevent data loss and resource leaks. For most applications, using the client as a context manager or relying on the automatic shutdown via atexit is sufficient.
Example:
```python
# Initialize Langfuse
langfuse = Langfuse(public_key="...", secret_key="...")

# Use Langfuse throughout your application
# ...

# When application is shutting down
langfuse.shutdown()
```
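Since the note above says the client can be used as a context manager, shutdown can also happen implicitly; a minimal sketch, assuming the context-manager protocol flushes and shuts down on exit:
```python
# Flush and shutdown are handled on context exit (per the note above).
with Langfuse(public_key="...", secret_key="...") as langfuse:
    with langfuse.start_as_current_span(name="operation"):
        pass
```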
2300 def get_current_trace_id(self) -> Optional[str]: 2301 """Get the trace ID of the current active span. 2302 2303 This method retrieves the trace ID from the currently active span in the context. 2304 It can be used to get the trace ID for referencing in logs, external systems, 2305 or for creating related operations. 2306 2307 Returns: 2308 The current trace ID as a 32-character lowercase hexadecimal string, 2309 or None if there is no active span. 2310 2311 Example: 2312 ```python 2313 with langfuse.start_as_current_span(name="process-request") as span: 2314 # Get the current trace ID for reference 2315 trace_id = langfuse.get_current_trace_id() 2316 2317 # Use it for external correlation 2318 log.info(f"Processing request with trace_id: {trace_id}") 2319 2320 # Or pass to another system 2321 external_system.process(data, trace_id=trace_id) 2322 ``` 2323 """ 2324 if not self._tracing_enabled: 2325 langfuse_logger.debug( 2326 "Operation skipped: get_current_trace_id - Tracing is disabled or client is in no-op mode." 2327 ) 2328 return None 2329 2330 current_otel_span = self._get_current_otel_span() 2331 2332 return self._get_otel_trace_id(current_otel_span) if current_otel_span else None
Get the trace ID of the current active span.
This method retrieves the trace ID from the currently active span in the context. It can be used to get the trace ID for referencing in logs, external systems, or for creating related operations.
Returns:
The current trace ID as a 32-character lowercase hexadecimal string, or None if there is no active span.
Example:
```python
with langfuse.start_as_current_span(name="process-request") as span:
    # Get the current trace ID for reference
    trace_id = langfuse.get_current_trace_id()

    # Use it for external correlation
    log.info(f"Processing request with trace_id: {trace_id}")

    # Or pass to another system
    external_system.process(data, trace_id=trace_id)
```
2334 def get_current_observation_id(self) -> Optional[str]: 2335 """Get the observation ID (span ID) of the current active span. 2336 2337 This method retrieves the observation ID from the currently active span in the context. 2338 It can be used to get the observation ID for referencing in logs, external systems, 2339 or for creating scores or other related operations. 2340 2341 Returns: 2342 The current observation ID as a 16-character lowercase hexadecimal string, 2343 or None if there is no active span. 2344 2345 Example: 2346 ```python 2347 with langfuse.start_as_current_span(name="process-user-query") as span: 2348 # Get the current observation ID 2349 observation_id = langfuse.get_current_observation_id() 2350 2351 # Store it for later reference 2352 cache.set(f"query_{query_id}_observation", observation_id) 2353 2354 # Process the query... 2355 ``` 2356 """ 2357 if not self._tracing_enabled: 2358 langfuse_logger.debug( 2359 "Operation skipped: get_current_observation_id - Tracing is disabled or client is in no-op mode." 2360 ) 2361 return None 2362 2363 current_otel_span = self._get_current_otel_span() 2364 2365 return self._get_otel_span_id(current_otel_span) if current_otel_span else None
Get the observation ID (span ID) of the current active span.
This method retrieves the observation ID from the currently active span in the context. It can be used to get the observation ID for referencing in logs, external systems, or for creating scores or other related operations.
Returns:
The current observation ID as a 16-character lowercase hexadecimal string, or None if there is no active span.
Example:
```python
with langfuse.start_as_current_span(name="process-user-query") as span:
    # Get the current observation ID
    observation_id = langfuse.get_current_observation_id()

    # Store it for later reference
    cache.set(f"query_{query_id}_observation", observation_id)

    # Process the query...
```
2378 def get_trace_url(self, *, trace_id: Optional[str] = None) -> Optional[str]: 2379 """Get the URL to view a trace in the Langfuse UI. 2380 2381 This method generates a URL that links directly to a trace in the Langfuse UI. 2382 It's useful for providing links in logs, notifications, or debugging tools. 2383 2384 Args: 2385 trace_id: Optional trace ID to generate a URL for. If not provided, 2386 the trace ID of the current active span will be used. 2387 2388 Returns: 2389 A URL string pointing to the trace in the Langfuse UI, 2390 or None if the project ID couldn't be retrieved or no trace ID is available. 2391 2392 Example: 2393 ```python 2394 # Get URL for the current trace 2395 with langfuse.start_as_current_span(name="process-request") as span: 2396 trace_url = langfuse.get_trace_url() 2397 log.info(f"Processing trace: {trace_url}") 2398 2399 # Get URL for a specific trace 2400 specific_trace_url = langfuse.get_trace_url(trace_id="1234567890abcdef1234567890abcdef") 2401 send_notification(f"Review needed for trace: {specific_trace_url}") 2402 ``` 2403 """ 2404 project_id = self._get_project_id() 2405 final_trace_id = trace_id or self.get_current_trace_id() 2406 2407 return ( 2408 f"{self._base_url}/project/{project_id}/traces/{final_trace_id}" 2409 if project_id and final_trace_id 2410 else None 2411 )
Get the URL to view a trace in the Langfuse UI.
This method generates a URL that links directly to a trace in the Langfuse UI. It's useful for providing links in logs, notifications, or debugging tools.
Arguments:
- trace_id: Optional trace ID to generate a URL for. If not provided, the trace ID of the current active span will be used.
Returns:
A URL string pointing to the trace in the Langfuse UI, or None if the project ID couldn't be retrieved or no trace ID is available.
Example:
```python
# Get URL for the current trace
with langfuse.start_as_current_span(name="process-request") as span:
    trace_url = langfuse.get_trace_url()
    log.info(f"Processing trace: {trace_url}")

# Get URL for a specific trace
specific_trace_url = langfuse.get_trace_url(trace_id="1234567890abcdef1234567890abcdef")
send_notification(f"Review needed for trace: {specific_trace_url}")
```
2413 def get_dataset( 2414 self, name: str, *, fetch_items_page_size: Optional[int] = 50 2415 ) -> "DatasetClient": 2416 """Fetch a dataset by its name. 2417 2418 Args: 2419 name (str): The name of the dataset to fetch. 2420 fetch_items_page_size (Optional[int]): All items of the dataset will be fetched in chunks of this size. Defaults to 50. 2421 2422 Returns: 2423 DatasetClient: The dataset with the given name. 2424 """ 2425 try: 2426 langfuse_logger.debug(f"Getting datasets {name}") 2427 dataset = self.api.datasets.get(dataset_name=name) 2428 2429 dataset_items = [] 2430 page = 1 2431 2432 while True: 2433 new_items = self.api.dataset_items.list( 2434 dataset_name=self._url_encode(name, is_url_param=True), 2435 page=page, 2436 limit=fetch_items_page_size, 2437 ) 2438 dataset_items.extend(new_items.data) 2439 2440 if new_items.meta.total_pages <= page: 2441 break 2442 2443 page += 1 2444 2445 items = [DatasetItemClient(i, langfuse=self) for i in dataset_items] 2446 2447 return DatasetClient(dataset, items=items) 2448 2449 except Error as e: 2450 handle_fern_exception(e) 2451 raise e
Fetch a dataset by its name.
Arguments:
- name (str): The name of the dataset to fetch.
- fetch_items_page_size (Optional[int]): All items of the dataset will be fetched in chunks of this size. Defaults to 50.
Returns:
DatasetClient: The dataset with the given name.
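No example is rendered for this method; a minimal sketch, assuming a dataset named "my-dataset" already exists in your project and that items expose the fields of the underlying DatasetItem API object:
```python
dataset = langfuse.get_dataset("my-dataset")

# Iterate the fetched items; each wraps a DatasetItem from the API.
for item in dataset.items:
    print(item.input, item.expected_output)
```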
2453 def run_experiment( 2454 self, 2455 *, 2456 name: str, 2457 run_name: Optional[str] = None, 2458 description: Optional[str] = None, 2459 data: ExperimentData, 2460 task: TaskFunction, 2461 evaluators: List[EvaluatorFunction] = [], 2462 run_evaluators: List[RunEvaluatorFunction] = [], 2463 max_concurrency: int = 50, 2464 metadata: Optional[Dict[str, str]] = None, 2465 ) -> ExperimentResult: 2466 """Run an experiment on a dataset with automatic tracing and evaluation. 2467 2468 This method executes a task function on each item in the provided dataset, 2469 automatically traces all executions with Langfuse for observability, runs 2470 item-level and run-level evaluators on the outputs, and returns comprehensive 2471 results with evaluation metrics. 2472 2473 The experiment system provides: 2474 - Automatic tracing of all task executions 2475 - Concurrent processing with configurable limits 2476 - Comprehensive error handling that isolates failures 2477 - Integration with Langfuse datasets for experiment tracking 2478 - Flexible evaluation framework supporting both sync and async evaluators 2479 2480 Args: 2481 name: Human-readable name for the experiment. Used for identification 2482 in the Langfuse UI. 2483 run_name: Optional exact name for the experiment run. If provided, this will be 2484 used as the exact dataset run name if the `data` contains Langfuse dataset items. 2485 If not provided, this will default to the experiment name appended with an ISO timestamp. 2486 description: Optional description explaining the experiment's purpose, 2487 methodology, or expected outcomes. 2488 data: Array of data items to process. Can be either: 2489 - List of dict-like items with 'input', 'expected_output', 'metadata' keys 2490 - List of Langfuse DatasetItem objects from dataset.items 2491 task: Function that processes each data item and returns output. 2492 Must accept 'item' as keyword argument and can return sync or async results. 2493 The task function signature should be: task(*, item, **kwargs) -> Any 2494 evaluators: List of functions to evaluate each item's output individually. 2495 Each evaluator receives input, output, expected_output, and metadata. 2496 Can return single Evaluation dict or list of Evaluation dicts. 2497 run_evaluators: List of functions to evaluate the entire experiment run. 2498 Each run evaluator receives all item_results and can compute aggregate metrics. 2499 Useful for calculating averages, distributions, or cross-item comparisons. 2500 max_concurrency: Maximum number of concurrent task executions (default: 50). 2501 Controls the number of items processed simultaneously. Adjust based on 2502 API rate limits and system resources. 2503 metadata: Optional metadata dictionary to attach to all experiment traces. 2504 This metadata will be included in every trace created during the experiment. 2505 If `data` are Langfuse dataset items, the metadata will be attached to the dataset run, too. 2506 2507 Returns: 2508 ExperimentResult containing: 2509 - run_name: The experiment run name. This is equal to the dataset run name if experiment was on Langfuse dataset. 
2510 - item_results: List of results for each processed item with outputs and evaluations 2511 - run_evaluations: List of aggregate evaluation results for the entire run 2512 - dataset_run_id: ID of the dataset run (if using Langfuse datasets) 2513 - dataset_run_url: Direct URL to view results in Langfuse UI (if applicable) 2514 2515 Raises: 2516 ValueError: If required parameters are missing or invalid 2517 Exception: If experiment setup fails (individual item failures are handled gracefully) 2518 2519 Examples: 2520 Basic experiment with local data: 2521 ```python 2522 def summarize_text(*, item, **kwargs): 2523 return f"Summary: {item['input'][:50]}..." 2524 2525 def length_evaluator(*, input, output, expected_output=None, **kwargs): 2526 return { 2527 "name": "output_length", 2528 "value": len(output), 2529 "comment": f"Output contains {len(output)} characters" 2530 } 2531 2532 result = langfuse.run_experiment( 2533 name="Text Summarization Test", 2534 description="Evaluate summarization quality and length", 2535 data=[ 2536 {"input": "Long article text...", "expected_output": "Expected summary"}, 2537 {"input": "Another article...", "expected_output": "Another summary"} 2538 ], 2539 task=summarize_text, 2540 evaluators=[length_evaluator] 2541 ) 2542 2543 print(f"Processed {len(result.item_results)} items") 2544 for item_result in result.item_results: 2545 print(f"Input: {item_result.item['input']}") 2546 print(f"Output: {item_result.output}") 2547 print(f"Evaluations: {item_result.evaluations}") 2548 ``` 2549 2550 Advanced experiment with async task and multiple evaluators: 2551 ```python 2552 async def llm_task(*, item, **kwargs): 2553 # Simulate async LLM call 2554 response = await openai_client.chat.completions.create( 2555 model="gpt-4", 2556 messages=[{"role": "user", "content": item["input"]}] 2557 ) 2558 return response.choices[0].message.content 2559 2560 def accuracy_evaluator(*, input, output, expected_output=None, **kwargs): 2561 if expected_output and expected_output.lower() in output.lower(): 2562 return {"name": "accuracy", "value": 1.0, "comment": "Correct answer"} 2563 return {"name": "accuracy", "value": 0.0, "comment": "Incorrect answer"} 2564 2565 def toxicity_evaluator(*, input, output, expected_output=None, **kwargs): 2566 # Simulate toxicity check 2567 toxicity_score = check_toxicity(output) # Your toxicity checker 2568 return { 2569 "name": "toxicity", 2570 "value": toxicity_score, 2571 "comment": f"Toxicity level: {'high' if toxicity_score > 0.7 else 'low'}" 2572 } 2573 2574 def average_accuracy(*, item_results, **kwargs): 2575 accuracies = [ 2576 eval.value for result in item_results 2577 for eval in result.evaluations 2578 if eval.name == "accuracy" 2579 ] 2580 return { 2581 "name": "average_accuracy", 2582 "value": sum(accuracies) / len(accuracies) if accuracies else 0, 2583 "comment": f"Average accuracy across {len(accuracies)} items" 2584 } 2585 2586 result = langfuse.run_experiment( 2587 name="LLM Safety and Accuracy Test", 2588 description="Evaluate model accuracy and safety across diverse prompts", 2589 data=test_dataset, # Your dataset items 2590 task=llm_task, 2591 evaluators=[accuracy_evaluator, toxicity_evaluator], 2592 run_evaluators=[average_accuracy], 2593 max_concurrency=5, # Limit concurrent API calls 2594 metadata={"model": "gpt-4", "temperature": 0.7} 2595 ) 2596 ``` 2597 2598 Using with Langfuse datasets: 2599 ```python 2600 # Get dataset from Langfuse 2601 dataset = langfuse.get_dataset("my-eval-dataset") 2602 2603 result = 
dataset.run_experiment( 2604 name="Production Model Evaluation", 2605 description="Monthly evaluation of production model performance", 2606 task=my_production_task, 2607 evaluators=[accuracy_evaluator, latency_evaluator] 2608 ) 2609 2610 # Results automatically linked to dataset in Langfuse UI 2611 print(f"View results: {result['dataset_run_url']}") 2612 ``` 2613 2614 Note: 2615 - Task and evaluator functions can be either synchronous or asynchronous 2616 - Individual item failures are logged but don't stop the experiment 2617 - All executions are automatically traced and visible in Langfuse UI 2618 - When using Langfuse datasets, results are automatically linked for easy comparison 2619 - This method works in both sync and async contexts (Jupyter notebooks, web apps, etc.) 2620 - Async execution is handled automatically with smart event loop detection 2621 """ 2622 return cast( 2623 ExperimentResult, 2624 run_async_safely( 2625 self._run_experiment_async( 2626 name=name, 2627 run_name=self._create_experiment_run_name( 2628 name=name, run_name=run_name 2629 ), 2630 description=description, 2631 data=data, 2632 task=task, 2633 evaluators=evaluators or [], 2634 run_evaluators=run_evaluators or [], 2635 max_concurrency=max_concurrency, 2636 metadata=metadata, 2637 ), 2638 ), 2639 )
Run an experiment on a dataset with automatic tracing and evaluation.
This method executes a task function on each item in the provided dataset, automatically traces all executions with Langfuse for observability, runs item-level and run-level evaluators on the outputs, and returns comprehensive results with evaluation metrics.
The experiment system provides:
- Automatic tracing of all task executions
- Concurrent processing with configurable limits
- Comprehensive error handling that isolates failures
- Integration with Langfuse datasets for experiment tracking
- Flexible evaluation framework supporting both sync and async evaluators
Arguments:
- name: Human-readable name for the experiment. Used for identification in the Langfuse UI.
- run_name: Optional exact name for the experiment run. If provided, this will be used as the exact dataset run name if the `data` contains Langfuse dataset items. If not provided, this will default to the experiment name appended with an ISO timestamp.
- description: Optional description explaining the experiment's purpose, methodology, or expected outcomes.
- data: Array of data items to process. Can be either:
- List of dict-like items with 'input', 'expected_output', 'metadata' keys
- List of Langfuse DatasetItem objects from dataset.items
- task: Function that processes each data item and returns output. Must accept 'item' as keyword argument and can return sync or async results. The task function signature should be: `task(*, item, **kwargs) -> Any`
- evaluators: List of functions to evaluate each item's output individually. Each evaluator receives input, output, expected_output, and metadata. Can return single Evaluation dict or list of Evaluation dicts.
- run_evaluators: List of functions to evaluate the entire experiment run. Each run evaluator receives all item_results and can compute aggregate metrics. Useful for calculating averages, distributions, or cross-item comparisons.
- max_concurrency: Maximum number of concurrent task executions (default: 50). Controls the number of items processed simultaneously. Adjust based on API rate limits and system resources.
- metadata: Optional metadata dictionary to attach to all experiment traces. This metadata will be included in every trace created during the experiment. If `data` are Langfuse dataset items, the metadata will be attached to the dataset run, too.
Returns:
ExperimentResult containing:
- run_name: The experiment run name. This is equal to the dataset run name if the experiment was run on a Langfuse dataset.
- item_results: List of results for each processed item with outputs and evaluations
- run_evaluations: List of aggregate evaluation results for the entire run
- dataset_run_id: ID of the dataset run (if using Langfuse datasets)
- dataset_run_url: Direct URL to view results in Langfuse UI (if applicable)
Raises:
- ValueError: If required parameters are missing or invalid
- Exception: If experiment setup fails (individual item failures are handled gracefully)
Examples:
Basic experiment with local data:
```python
def summarize_text(*, item, **kwargs):
    return f"Summary: {item['input'][:50]}..."

def length_evaluator(*, input, output, expected_output=None, **kwargs):
    return {
        "name": "output_length",
        "value": len(output),
        "comment": f"Output contains {len(output)} characters"
    }

result = langfuse.run_experiment(
    name="Text Summarization Test",
    description="Evaluate summarization quality and length",
    data=[
        {"input": "Long article text...", "expected_output": "Expected summary"},
        {"input": "Another article...", "expected_output": "Another summary"}
    ],
    task=summarize_text,
    evaluators=[length_evaluator]
)

print(f"Processed {len(result.item_results)} items")
for item_result in result.item_results:
    print(f"Input: {item_result.item['input']}")
    print(f"Output: {item_result.output}")
    print(f"Evaluations: {item_result.evaluations}")
```

Advanced experiment with async task and multiple evaluators:

```python
async def llm_task(*, item, **kwargs):
    # Simulate async LLM call
    response = await openai_client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": item["input"]}]
    )
    return response.choices[0].message.content

def accuracy_evaluator(*, input, output, expected_output=None, **kwargs):
    if expected_output and expected_output.lower() in output.lower():
        return {"name": "accuracy", "value": 1.0, "comment": "Correct answer"}
    return {"name": "accuracy", "value": 0.0, "comment": "Incorrect answer"}

def toxicity_evaluator(*, input, output, expected_output=None, **kwargs):
    # Simulate toxicity check
    toxicity_score = check_toxicity(output)  # Your toxicity checker
    return {
        "name": "toxicity",
        "value": toxicity_score,
        "comment": f"Toxicity level: {'high' if toxicity_score > 0.7 else 'low'}"
    }

def average_accuracy(*, item_results, **kwargs):
    accuracies = [
        eval.value for result in item_results
        for eval in result.evaluations
        if eval.name == "accuracy"
    ]
    return {
        "name": "average_accuracy",
        "value": sum(accuracies) / len(accuracies) if accuracies else 0,
        "comment": f"Average accuracy across {len(accuracies)} items"
    }

result = langfuse.run_experiment(
    name="LLM Safety and Accuracy Test",
    description="Evaluate model accuracy and safety across diverse prompts",
    data=test_dataset,  # Your dataset items
    task=llm_task,
    evaluators=[accuracy_evaluator, toxicity_evaluator],
    run_evaluators=[average_accuracy],
    max_concurrency=5,  # Limit concurrent API calls
    metadata={"model": "gpt-4", "temperature": 0.7}
)
```

Using with Langfuse datasets:

```python
# Get dataset from Langfuse
dataset = langfuse.get_dataset("my-eval-dataset")

result = dataset.run_experiment(
    name="Production Model Evaluation",
    description="Monthly evaluation of production model performance",
    task=my_production_task,
    evaluators=[accuracy_evaluator, latency_evaluator]
)

# Results automatically linked to dataset in Langfuse UI
print(f"View results: {result.dataset_run_url}")
```
Note:
- Task and evaluator functions can be either synchronous or asynchronous
- Individual item failures are logged but don't stop the experiment
- All executions are automatically traced and visible in Langfuse UI
- When using Langfuse datasets, results are automatically linked for easy comparison
- This method works in both sync and async contexts (Jupyter notebooks, web apps, etc.)
- Async execution is handled automatically with smart event loop detection
2923 def auth_check(self) -> bool: 2924 """Check if the provided credentials (public and secret key) are valid. 2925 2926 Raises: 2927 Exception: If no projects were found for the provided credentials. 2928 2929 Note: 2930 This method is blocking. It is discouraged to use it in production code. 2931 """ 2932 try: 2933 projects = self.api.projects.get() 2934 langfuse_logger.debug( 2935 f"Auth check successful, found {len(projects.data)} projects" 2936 ) 2937 if len(projects.data) == 0: 2938 raise Exception( 2939 "Auth check failed, no project found for the keys provided." 2940 ) 2941 return True 2942 2943 except AttributeError as e: 2944 langfuse_logger.warning( 2945 f"Auth check failed: Client not properly initialized. Error: {e}" 2946 ) 2947 return False 2948 2949 except Error as e: 2950 handle_fern_exception(e) 2951 raise e
Check if the provided credentials (public and secret key) are valid.
Raises:
- Exception: If no projects were found for the provided credentials.
Note:
This method is blocking. Avoid using it in production code.
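A minimal sketch of a startup credential check:
```python
from langfuse import Langfuse

langfuse = Langfuse()  # reads keys from environment variables

# Verify credentials once at startup; auth_check() is blocking.
if not langfuse.auth_check():
    raise RuntimeError("Langfuse credentials are invalid or the client is misconfigured")
```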
2953 def create_dataset( 2954 self, 2955 *, 2956 name: str, 2957 description: Optional[str] = None, 2958 metadata: Optional[Any] = None, 2959 ) -> Dataset: 2960 """Create a dataset with the given name on Langfuse. 2961 2962 Args: 2963 name: Name of the dataset to create. 2964 description: Description of the dataset. Defaults to None. 2965 metadata: Additional metadata. Defaults to None. 2966 2967 Returns: 2968 Dataset: The created dataset as returned by the Langfuse API. 2969 """ 2970 try: 2971 body = CreateDatasetRequest( 2972 name=name, description=description, metadata=metadata 2973 ) 2974 langfuse_logger.debug(f"Creating datasets {body}") 2975 2976 return self.api.datasets.create(request=body) 2977 2978 except Error as e: 2979 handle_fern_exception(e) 2980 raise e
Create a dataset with the given name on Langfuse.
Arguments:
- name: Name of the dataset to create.
- description: Description of the dataset. Defaults to None.
- metadata: Additional metadata. Defaults to None.
Returns:
Dataset: The created dataset as returned by the Langfuse API.
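No example is rendered here; a minimal sketch, with a hypothetical dataset name and metadata:
```python
dataset = langfuse.create_dataset(
    name="capital_cities",
    description="Country-to-capital lookups for evaluation",
    metadata={"owner": "eval-team"},  # hypothetical metadata
)
print(dataset.name)
```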
2982 def create_dataset_item( 2983 self, 2984 *, 2985 dataset_name: str, 2986 input: Optional[Any] = None, 2987 expected_output: Optional[Any] = None, 2988 metadata: Optional[Any] = None, 2989 source_trace_id: Optional[str] = None, 2990 source_observation_id: Optional[str] = None, 2991 status: Optional[DatasetStatus] = None, 2992 id: Optional[str] = None, 2993 ) -> DatasetItem: 2994 """Create a dataset item. 2995 2996 Upserts if an item with id already exists. 2997 2998 Args: 2999 dataset_name: Name of the dataset in which the dataset item should be created. 3000 input: Input data. Defaults to None. Can contain any dict, list or scalar. 3001 expected_output: Expected output data. Defaults to None. Can contain any dict, list or scalar. 3002 metadata: Additional metadata. Defaults to None. Can contain any dict, list or scalar. 3003 source_trace_id: Id of the source trace. Defaults to None. 3004 source_observation_id: Id of the source observation. Defaults to None. 3005 status: Status of the dataset item. Defaults to ACTIVE for newly created items. 3006 id: Id of the dataset item. Defaults to None. Provide your own id if you want to dedupe dataset items. Id needs to be globally unique and cannot be reused across datasets. 3007 3008 Returns: 3009 DatasetItem: The created dataset item as returned by the Langfuse API. 3010 3011 Example: 3012 ```python 3013 from langfuse import Langfuse 3014 3015 langfuse = Langfuse() 3016 3017 # Uploading items to the Langfuse dataset named "capital_cities" 3018 langfuse.create_dataset_item( 3019 dataset_name="capital_cities", 3020 input={"input": {"country": "Italy"}}, 3021 expected_output={"expected_output": "Rome"}, 3022 metadata={"foo": "bar"} 3023 ) 3024 ``` 3025 """ 3026 try: 3027 body = CreateDatasetItemRequest( 3028 datasetName=dataset_name, 3029 input=input, 3030 expectedOutput=expected_output, 3031 metadata=metadata, 3032 sourceTraceId=source_trace_id, 3033 sourceObservationId=source_observation_id, 3034 status=status, 3035 id=id, 3036 ) 3037 langfuse_logger.debug(f"Creating dataset item {body}") 3038 return self.api.dataset_items.create(request=body) 3039 except Error as e: 3040 handle_fern_exception(e) 3041 raise e
Create a dataset item.
Upserts if an item with id already exists.
Arguments:
- dataset_name: Name of the dataset in which the dataset item should be created.
- input: Input data. Defaults to None. Can contain any dict, list or scalar.
- expected_output: Expected output data. Defaults to None. Can contain any dict, list or scalar.
- metadata: Additional metadata. Defaults to None. Can contain any dict, list or scalar.
- source_trace_id: Id of the source trace. Defaults to None.
- source_observation_id: Id of the source observation. Defaults to None.
- status: Status of the dataset item. Defaults to ACTIVE for newly created items.
- id: Id of the dataset item. Defaults to None. Provide your own id if you want to dedupe dataset items. Id needs to be globally unique and cannot be reused across datasets.
Returns:
DatasetItem: The created dataset item as returned by the Langfuse API.
Example:
```python
from langfuse import Langfuse

langfuse = Langfuse()

# Uploading items to the Langfuse dataset named "capital_cities"
langfuse.create_dataset_item(
    dataset_name="capital_cities",
    input={"input": {"country": "Italy"}},
    expected_output={"expected_output": "Rome"},
    metadata={"foo": "bar"}
)
```
3043 def resolve_media_references( 3044 self, 3045 *, 3046 obj: Any, 3047 resolve_with: Literal["base64_data_uri"], 3048 max_depth: int = 10, 3049 content_fetch_timeout_seconds: int = 5, 3050 ) -> Any: 3051 """Replace media reference strings in an object with base64 data URIs. 3052 3053 This method recursively traverses an object (up to max_depth) looking for media reference strings 3054 in the format "@@@langfuseMedia:...@@@". When found, it (synchronously) fetches the actual media content using 3055 the provided Langfuse client and replaces the reference string with a base64 data URI. 3056 3057 If fetching media content fails for a reference string, a warning is logged and the reference 3058 string is left unchanged. 3059 3060 Args: 3061 obj: The object to process. Can be a primitive value, array, or nested object. 3062 If the object has a __dict__ attribute, a dict will be returned instead of the original object type. 3063 resolve_with: The representation of the media content to replace the media reference string with. 3064 Currently only "base64_data_uri" is supported. 3065 max_depth: int: The maximum depth to traverse the object. Default is 10. 3066 content_fetch_timeout_seconds: int: The timeout in seconds for fetching media content. Default is 5. 3067 3068 Returns: 3069 A deep copy of the input object with all media references replaced with base64 data URIs where possible. 3070 If the input object has a __dict__ attribute, a dict will be returned instead of the original object type. 3071 3072 Example: 3073 obj = { 3074 "image": "@@@langfuseMedia:type=image/jpeg|id=123|source=bytes@@@", 3075 "nested": { 3076 "pdf": "@@@langfuseMedia:type=application/pdf|id=456|source=bytes@@@" 3077 } 3078 } 3079 3080 result = await LangfuseMedia.resolve_media_references(obj, langfuse_client) 3081 3082 # Result: 3083 # { 3084 # "image": "data:image/jpeg;base64,/9j/4AAQSkZJRg...", 3085 # "nested": { 3086 # "pdf": "data:application/pdf;base64,JVBERi0xLjcK..." 3087 # } 3088 # } 3089 """ 3090 return LangfuseMedia.resolve_media_references( 3091 langfuse_client=self, 3092 obj=obj, 3093 resolve_with=resolve_with, 3094 max_depth=max_depth, 3095 content_fetch_timeout_seconds=content_fetch_timeout_seconds, 3096 )
Replace media reference strings in an object with base64 data URIs.
This method recursively traverses an object (up to max_depth) looking for media reference strings in the format "@@@langfuseMedia:...@@@". When found, it (synchronously) fetches the actual media content using the provided Langfuse client and replaces the reference string with a base64 data URI.
If fetching media content fails for a reference string, a warning is logged and the reference string is left unchanged.
Arguments:
- obj: The object to process. Can be a primitive value, array, or nested object. If the object has a __dict__ attribute, a dict will be returned instead of the original object type.
- resolve_with: The representation of the media content to replace the media reference string with. Currently only "base64_data_uri" is supported.
- max_depth (int): The maximum depth to traverse the object. Defaults to 10.
- content_fetch_timeout_seconds (int): The timeout in seconds for fetching media content. Defaults to 5.
Returns:
A deep copy of the input object with all media references replaced with base64 data URIs where possible. If the input object has a __dict__ attribute, a dict will be returned instead of the original object type.
Example:
```python
obj = {
    "image": "@@@langfuseMedia:type=image/jpeg|id=123|source=bytes@@@",
    "nested": {
        "pdf": "@@@langfuseMedia:type=application/pdf|id=456|source=bytes@@@"
    }
}

result = langfuse.resolve_media_references(obj=obj, resolve_with="base64_data_uri")

# Result:
# {
#     "image": "data:image/jpeg;base64,/9j/4AAQSkZJRg...",
#     "nested": {
#         "pdf": "data:application/pdf;base64,JVBERi0xLjcK..."
#     }
# }
```
3126 def get_prompt( 3127 self, 3128 name: str, 3129 *, 3130 version: Optional[int] = None, 3131 label: Optional[str] = None, 3132 type: Literal["chat", "text"] = "text", 3133 cache_ttl_seconds: Optional[int] = None, 3134 fallback: Union[Optional[List[ChatMessageDict]], Optional[str]] = None, 3135 max_retries: Optional[int] = None, 3136 fetch_timeout_seconds: Optional[int] = None, 3137 ) -> PromptClient: 3138 """Get a prompt. 3139 3140 This method attempts to fetch the requested prompt from the local cache. If the prompt is not found 3141 in the cache or if the cached prompt has expired, it will try to fetch the prompt from the server again 3142 and update the cache. If fetching the new prompt fails, and there is an expired prompt in the cache, it will 3143 return the expired prompt as a fallback. 3144 3145 Args: 3146 name (str): The name of the prompt to retrieve. 3147 3148 Keyword Args: 3149 version (Optional[int]): The version of the prompt to retrieve. If no label and version is specified, the `production` label is returned. Specify either version or label, not both. 3150 label: Optional[str]: The label of the prompt to retrieve. If no label and version is specified, the `production` label is returned. Specify either version or label, not both. 3151 cache_ttl_seconds: Optional[int]: Time-to-live in seconds for caching the prompt. Must be specified as a 3152 keyword argument. If not set, defaults to 60 seconds. Disables caching if set to 0. 3153 type: Literal["chat", "text"]: The type of the prompt to retrieve. Defaults to "text". 3154 fallback: Union[Optional[List[ChatMessageDict]], Optional[str]]: The prompt string to return if fetching the prompt fails. Important on the first call where no cached prompt is available. Follows Langfuse prompt formatting with double curly braces for variables. Defaults to None. 3155 max_retries: Optional[int]: The maximum number of retries in case of API/network errors. Defaults to 2. The maximum value is 4. Retries have an exponential backoff with a maximum delay of 10 seconds. 3156 fetch_timeout_seconds: Optional[int]: The timeout in milliseconds for fetching the prompt. Defaults to the default timeout set on the SDK, which is 5 seconds per default. 3157 3158 Returns: 3159 The prompt object retrieved from the cache or directly fetched if not cached or expired of type 3160 - TextPromptClient, if type argument is 'text'. 3161 - ChatPromptClient, if type argument is 'chat'. 3162 3163 Raises: 3164 Exception: Propagates any exceptions raised during the fetching of a new prompt, unless there is an 3165 expired prompt in the cache, in which case it logs a warning and returns the expired prompt. 3166 """ 3167 if self._resources is None: 3168 raise Error( 3169 "SDK is not correctly initialized. Check the init logs for more details." 3170 ) 3171 if version is not None and label is not None: 3172 raise ValueError("Cannot specify both version and label at the same time.") 3173 3174 if not name: 3175 raise ValueError("Prompt name cannot be empty.") 3176 3177 cache_key = PromptCache.generate_cache_key(name, version=version, label=label) 3178 bounded_max_retries = self._get_bounded_max_retries( 3179 max_retries, default_max_retries=2, max_retries_upper_bound=4 3180 ) 3181 3182 langfuse_logger.debug(f"Getting prompt '{cache_key}'") 3183 cached_prompt = self._resources.prompt_cache.get(cache_key) 3184 3185 if cached_prompt is None or cache_ttl_seconds == 0: 3186 langfuse_logger.debug( 3187 f"Prompt '{cache_key}' not found in cache or caching disabled." 
3188 ) 3189 try: 3190 return self._fetch_prompt_and_update_cache( 3191 name, 3192 version=version, 3193 label=label, 3194 ttl_seconds=cache_ttl_seconds, 3195 max_retries=bounded_max_retries, 3196 fetch_timeout_seconds=fetch_timeout_seconds, 3197 ) 3198 except Exception as e: 3199 if fallback: 3200 langfuse_logger.warning( 3201 f"Returning fallback prompt for '{cache_key}' due to fetch error: {e}" 3202 ) 3203 3204 fallback_client_args: Dict[str, Any] = { 3205 "name": name, 3206 "prompt": fallback, 3207 "type": type, 3208 "version": version or 0, 3209 "config": {}, 3210 "labels": [label] if label else [], 3211 "tags": [], 3212 } 3213 3214 if type == "text": 3215 return TextPromptClient( 3216 prompt=Prompt_Text(**fallback_client_args), 3217 is_fallback=True, 3218 ) 3219 3220 if type == "chat": 3221 return ChatPromptClient( 3222 prompt=Prompt_Chat(**fallback_client_args), 3223 is_fallback=True, 3224 ) 3225 3226 raise e 3227 3228 if cached_prompt.is_expired(): 3229 langfuse_logger.debug(f"Stale prompt '{cache_key}' found in cache.") 3230 try: 3231 # refresh prompt in background thread, refresh_prompt deduplicates tasks 3232 langfuse_logger.debug(f"Refreshing prompt '{cache_key}' in background.") 3233 3234 def refresh_task() -> None: 3235 self._fetch_prompt_and_update_cache( 3236 name, 3237 version=version, 3238 label=label, 3239 ttl_seconds=cache_ttl_seconds, 3240 max_retries=bounded_max_retries, 3241 fetch_timeout_seconds=fetch_timeout_seconds, 3242 ) 3243 3244 self._resources.prompt_cache.add_refresh_prompt_task( 3245 cache_key, 3246 refresh_task, 3247 ) 3248 langfuse_logger.debug( 3249 f"Returning stale prompt '{cache_key}' from cache." 3250 ) 3251 # return stale prompt 3252 return cached_prompt.value 3253 3254 except Exception as e: 3255 langfuse_logger.warning( 3256 f"Error when refreshing cached prompt '{cache_key}', returning cached version. Error: {e}" 3257 ) 3258 # creation of refresh prompt task failed, return stale prompt 3259 return cached_prompt.value 3260 3261 return cached_prompt.value
Get a prompt.
This method attempts to fetch the requested prompt from the local cache. If the prompt is not found in the cache or if the cached prompt has expired, it will try to fetch the prompt from the server again and update the cache. If fetching the new prompt fails, and there is an expired prompt in the cache, it will return the expired prompt as a fallback.
Arguments:
- name (str): The name of the prompt to retrieve.
Keyword Args:
- version (Optional[int]): The version of the prompt to retrieve. If no label and version is specified, the `production` label is returned. Specify either version or label, not both.
- label (Optional[str]): The label of the prompt to retrieve. If no label and version is specified, the `production` label is returned. Specify either version or label, not both.
- cache_ttl_seconds (Optional[int]): Time-to-live in seconds for caching the prompt. Must be specified as a keyword argument. If not set, defaults to 60 seconds. Disables caching if set to 0.
- type (Literal["chat", "text"]): The type of the prompt to retrieve. Defaults to "text".
- fallback (Union[Optional[List[ChatMessageDict]], Optional[str]]): The prompt string to return if fetching the prompt fails. Important on the first call where no cached prompt is available. Follows Langfuse prompt formatting with double curly braces for variables. Defaults to None.
- max_retries (Optional[int]): The maximum number of retries in case of API/network errors. Defaults to 2. The maximum value is 4. Retries have an exponential backoff with a maximum delay of 10 seconds.
- fetch_timeout_seconds (Optional[int]): The timeout in seconds for fetching the prompt. Defaults to the SDK-wide timeout, which is 5 seconds by default.
Returns:
The prompt object retrieved from the cache or directly fetched if not cached or expired of type
- TextPromptClient, if type argument is 'text'.
- ChatPromptClient, if type argument is 'chat'.
Raises:
- Exception: Propagates any exceptions raised during the fetching of a new prompt, unless there is an expired prompt in the cache, in which case it logs a warning and returns the expired prompt.
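No example is rendered for this method; a minimal sketch, assuming a text prompt named "movie-critic" with a `{{movie}}` variable exists in your project:
```python
# Fetch the production version of a text prompt and compile its variables.
prompt = langfuse.get_prompt("movie-critic")
compiled = prompt.compile(movie="Dune: Part Two")

# Provide a fallback for the first call, when no cached prompt is available.
prompt = langfuse.get_prompt(
    "movie-critic",
    fallback="Review the movie {{movie}}.",
)
```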
3355 def create_prompt( 3356 self, 3357 *, 3358 name: str, 3359 prompt: Union[ 3360 str, List[Union[ChatMessageDict, ChatMessageWithPlaceholdersDict]] 3361 ], 3362 labels: List[str] = [], 3363 tags: Optional[List[str]] = None, 3364 type: Optional[Literal["chat", "text"]] = "text", 3365 config: Optional[Any] = None, 3366 commit_message: Optional[str] = None, 3367 ) -> PromptClient: 3368 """Create a new prompt in Langfuse. 3369 3370 Keyword Args: 3371 name : The name of the prompt to be created. 3372 prompt : The content of the prompt to be created. 3373 is_active [DEPRECATED] : A flag indicating whether the prompt is active or not. This is deprecated and will be removed in a future release. Please use the 'production' label instead. 3374 labels: The labels of the prompt. Defaults to None. To create a default-served prompt, add the 'production' label. 3375 tags: The tags of the prompt. Defaults to None. Will be applied to all versions of the prompt. 3376 config: Additional structured data to be saved with the prompt. Defaults to None. 3377 type: The type of the prompt to be created. "chat" vs. "text". Defaults to "text". 3378 commit_message: Optional string describing the change. 3379 3380 Returns: 3381 TextPromptClient: The prompt if type argument is 'text'. 3382 ChatPromptClient: The prompt if type argument is 'chat'. 3383 """ 3384 try: 3385 langfuse_logger.debug(f"Creating prompt {name=}, {labels=}") 3386 3387 if type == "chat": 3388 if not isinstance(prompt, list): 3389 raise ValueError( 3390 "For 'chat' type, 'prompt' must be a list of chat messages with role and content attributes." 3391 ) 3392 request: Union[CreatePromptRequest_Chat, CreatePromptRequest_Text] = ( 3393 CreatePromptRequest_Chat( 3394 name=name, 3395 prompt=cast(Any, prompt), 3396 labels=labels, 3397 tags=tags, 3398 config=config or {}, 3399 commitMessage=commit_message, 3400 type="chat", 3401 ) 3402 ) 3403 server_prompt = self.api.prompts.create(request=request) 3404 3405 if self._resources is not None: 3406 self._resources.prompt_cache.invalidate(name) 3407 3408 return ChatPromptClient(prompt=cast(Prompt_Chat, server_prompt)) 3409 3410 if not isinstance(prompt, str): 3411 raise ValueError("For 'text' type, 'prompt' must be a string.") 3412 3413 request = CreatePromptRequest_Text( 3414 name=name, 3415 prompt=prompt, 3416 labels=labels, 3417 tags=tags, 3418 config=config or {}, 3419 commitMessage=commit_message, 3420 type="text", 3421 ) 3422 3423 server_prompt = self.api.prompts.create(request=request) 3424 3425 if self._resources is not None: 3426 self._resources.prompt_cache.invalidate(name) 3427 3428 return TextPromptClient(prompt=cast(Prompt_Text, server_prompt)) 3429 3430 except Error as e: 3431 handle_fern_exception(e) 3432 raise e
Create a new prompt in Langfuse.
Keyword Args:
- name: The name of the prompt to be created.
- prompt: The content of the prompt to be created.
- is_active [DEPRECATED]: A flag indicating whether the prompt is active or not. This is deprecated and will be removed in a future release. Please use the 'production' label instead.
- labels: The labels of the prompt. Defaults to an empty list. To create a default-served prompt, add the 'production' label.
- tags: The tags of the prompt. Defaults to None. Will be applied to all versions of the prompt.
- config: Additional structured data to be saved with the prompt. Defaults to None.
- type: The type of the prompt to be created. "chat" vs. "text". Defaults to "text".
- commit_message: Optional string describing the change.
Returns:
- TextPromptClient: The prompt if the type argument is 'text'.
- ChatPromptClient: The prompt if the type argument is 'chat'.
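As a brief sketch (prompt names and contents here are invented for illustration), creating both variants might look like this:

```python
from langfuse import Langfuse

langfuse = Langfuse()

# Text prompt, served by default via the 'production' label
text_prompt = langfuse.create_prompt(
    name="movie-critic",
    prompt="As a critic, rate the movie {{movie}} on a scale of 1-10.",
    labels=["production"],
    config={"model": "gpt-4", "temperature": 0.7},
)

# Chat prompt: a list of messages with role and content
chat_prompt = langfuse.create_prompt(
    name="movie-critic-chat",
    type="chat",
    prompt=[{"role": "system", "content": "You are an expert movie critic."}],
    labels=["production"],
)
```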
3434 def update_prompt( 3435 self, 3436 *, 3437 name: str, 3438 version: int, 3439 new_labels: List[str] = [], 3440 ) -> Any: 3441 """Update an existing prompt version in Langfuse. The Langfuse SDK prompt cache is invalidated for all prompts with the specified name. 3442 3443 Args: 3444 name (str): The name of the prompt to update. 3445 version (int): The version number of the prompt to update. 3446 new_labels (List[str], optional): New labels to assign to the prompt version. Labels are unique across versions. The "latest" label is reserved and managed by Langfuse. Defaults to []. 3447 3448 Returns: 3449 Prompt: The updated prompt from the Langfuse API. 3450 3451 """ 3452 updated_prompt = self.api.prompt_version.update( 3453 name=self._url_encode(name), 3454 version=version, 3455 new_labels=new_labels, 3456 ) 3457 3458 if self._resources is not None: 3459 self._resources.prompt_cache.invalidate(name) 3460 3461 return updated_prompt
Update an existing prompt version in Langfuse. The Langfuse SDK prompt cache is invalidated for all prompts with the specified name.
Arguments:
- name (str): The name of the prompt to update.
- version (int): The version number of the prompt to update.
- new_labels (List[str], optional): New labels to assign to the prompt version. Labels are unique across versions. The "latest" label is reserved and managed by Langfuse. Defaults to [].
Returns:
Prompt: The updated prompt from the Langfuse API.
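For instance, a minimal sketch promoting an existing version to be served by default (assuming a prompt "movie-critic" with a version 2):

```python
# Assign the 'production' label to version 2; the SDK prompt cache for
# "movie-critic" is invalidated so the next fetch returns fresh data
updated = langfuse.update_prompt(
    name="movie-critic",
    version=2,
    new_labels=["production"],
)
```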
3476 def clear_prompt_cache(self) -> None: 3477 """Clear the entire prompt cache, removing all cached prompts. 3478 3479 This method is useful when you want to force a complete refresh of all 3480 cached prompts, for example after major updates or when you need to 3481 ensure the latest versions are fetched from the server. 3482 """ 3483 if self._resources is not None: 3484 self._resources.prompt_cache.clear()
Clear the entire prompt cache, removing all cached prompts.
This method is useful when you want to force a complete refresh of all cached prompts, for example after major updates or when you need to ensure the latest versions are fetched from the server.
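For example:

```python
# Force all subsequent prompt fetches to hit the Langfuse API again
langfuse.clear_prompt_cache()
```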
59def get_client(*, public_key: Optional[str] = None) -> Langfuse: 60 """Get or create a Langfuse client instance. 61 62 Returns an existing Langfuse client or creates a new one if none exists. In multi-project setups, 63 providing a public_key is required. Multi-project support is experimental - see Langfuse docs. 64 65 Behavior: 66 - Single project: Returns existing client or creates new one 67 - Multi-project: Requires public_key to return specific client 68 - No public_key in multi-project: Returns disabled client to prevent data leakage 69 70 The function uses a singleton pattern per public_key to conserve resources and maintain state. 71 72 Args: 73 public_key (Optional[str]): Project identifier 74 - With key: Returns client for that project 75 - Without key: Returns single client or disabled client if multiple exist 76 77 Returns: 78 Langfuse: Client instance in one of three states: 79 1. Client for specified public_key 80 2. Default client for single-project setup 81 3. Disabled client when multiple projects exist without key 82 83 Security: 84 Disables tracing when multiple projects exist without explicit key to prevent 85 cross-project data leakage. Multi-project setups are experimental. 86 87 Example: 88 ```python 89 # Single project 90 client = get_client() # Default client 91 92 # In multi-project usage: 93 client_a = get_client(public_key="project_a_key") # Returns project A's client 94 client_b = get_client(public_key="project_b_key") # Returns project B's client 95 96 # Without specific key in multi-project setup: 97 client = get_client() # Returns disabled client for safety 98 ``` 99 """ 100 with LangfuseResourceManager._lock: 101 active_instances = LangfuseResourceManager._instances 102 103 # If no explicit public_key provided, check execution context 104 if not public_key: 105 public_key = _current_public_key.get(None) 106 107 if not public_key: 108 if len(active_instances) == 0: 109 # No clients initialized yet, create default instance 110 return Langfuse() 111 112 if len(active_instances) == 1: 113 # Only one client exists, safe to use without specifying key 114 instance = list(active_instances.values())[0] 115 116 # Initialize with the credentials bound to the instance 117 # This is important if the original instance was instantiated 118 # via constructor arguments 119 return _create_client_from_instance(instance) 120 121 else: 122 # Multiple clients exist but no key specified - disable tracing 123 # to prevent cross-project data leakage 124 langfuse_logger.warning( 125 "No 'langfuse_public_key' passed to decorated function, but multiple langfuse clients are instantiated in current process. Skipping tracing for this function to avoid cross-project leakage." 126 ) 127 return Langfuse( 128 tracing_enabled=False, public_key="fake", secret_key="fake" 129 ) 130 131 else: 132 # Specific key provided, look up existing instance 133 target_instance: Optional[LangfuseResourceManager] = active_instances.get( 134 public_key, None 135 ) 136 137 if target_instance is None: 138 # No instance found with this key - client not initialized properly 139 langfuse_logger.warning( 140 f"No Langfuse client with public key {public_key} has been initialized. Skipping tracing for decorated function." 141 ) 142 return Langfuse( 143 tracing_enabled=False, public_key="fake", secret_key="fake" 144 ) 145 146 # target_instance is guaranteed to be not None at this point 147 return _create_client_from_instance(target_instance, public_key)
Get or create a Langfuse client instance.
Returns an existing Langfuse client or creates a new one if none exists. In multi-project setups, providing a public_key is required. Multi-project support is experimental; see the Langfuse docs.
Behavior:
- Single project: Returns existing client or creates new one
- Multi-project: Requires public_key to return specific client
- No public_key in multi-project: Returns disabled client to prevent data leakage
The function uses a singleton pattern per public_key to conserve resources and maintain state.
Arguments:
- public_key (Optional[str]): Project identifier
- With key: Returns client for that project
- Without key: Returns single client or disabled client if multiple exist
Returns:
Langfuse: Client instance in one of three states:
1. Client for the specified public_key
2. Default client for a single-project setup
3. Disabled client when multiple projects exist without a key
Security:
Disables tracing when multiple projects exist without explicit key to prevent cross-project data leakage. Multi-project setups are experimental.
Example:
```python
# Single project
client = get_client()  # Default client

# In multi-project usage:
client_a = get_client(public_key="project_a_key")  # Returns project A's client
client_b = get_client(public_key="project_b_key")  # Returns project B's client

# Without specific key in multi-project setup:
client = get_client()  # Returns disabled client for safety
```
90 def observe( 91 self, 92 func: Optional[F] = None, 93 *, 94 name: Optional[str] = None, 95 as_type: Optional[ObservationTypeLiteralNoEvent] = None, 96 capture_input: Optional[bool] = None, 97 capture_output: Optional[bool] = None, 98 transform_to_string: Optional[Callable[[Iterable], str]] = None, 99 ) -> Union[F, Callable[[F], F]]: 100 """Wrap a function to create and manage Langfuse tracing around its execution, supporting both synchronous and asynchronous functions. 101 102 This decorator provides seamless integration of Langfuse observability into your codebase. It automatically creates 103 spans or generations around function execution, capturing timing, inputs/outputs, and error states. The decorator 104 intelligently handles both synchronous and asynchronous functions, preserving function signatures and type hints. 105 106 Using OpenTelemetry's distributed tracing system, it maintains proper trace context propagation throughout your application, 107 enabling you to see hierarchical traces of function calls with detailed performance metrics and function-specific details. 108 109 Args: 110 func (Optional[Callable]): The function to decorate. When used with parentheses @observe(), this will be None. 111 name (Optional[str]): Custom name for the created trace or span. If not provided, the function name is used. 112 as_type (Optional[Literal]): Set the observation type. Supported values: 113 "generation", "span", "agent", "tool", "chain", "retriever", "embedding", "evaluator", "guardrail". 114 Observation types are highlighted in the Langfuse UI for filtering and visualization. 115 The types "generation" and "embedding" create a span on which additional attributes such as model metrics 116 can be set. 117 118 Returns: 119 Callable: A wrapped version of the original function that automatically creates and manages Langfuse spans. 120 121 Example: 122 For general function tracing with automatic naming: 123 ```python 124 @observe() 125 def process_user_request(user_id, query): 126 # Function is automatically traced with name "process_user_request" 127 return get_response(query) 128 ``` 129 130 For language model generation tracking: 131 ```python 132 @observe(name="answer-generation", as_type="generation") 133 async def generate_answer(query): 134 # Creates a generation-type span with extended LLM metrics 135 response = await openai.chat.completions.create( 136 model="gpt-4", 137 messages=[{"role": "user", "content": query}] 138 ) 139 return response.choices[0].message.content 140 ``` 141 142 For trace context propagation between functions: 143 ```python 144 @observe() 145 def main_process(): 146 # Parent span is created 147 return sub_process() # Child span automatically connected to parent 148 149 @observe() 150 def sub_process(): 151 # Automatically becomes a child span of main_process 152 return "result" 153 ``` 154 155 Raises: 156 Exception: Propagates any exceptions from the wrapped function after logging them in the trace. 157 158 Notes: 159 - The decorator preserves the original function's signature, docstring, and return type. 160 - Proper parent-child relationships between spans are automatically maintained. 
161 - Special keyword arguments can be passed to control tracing: 162 - langfuse_trace_id: Explicitly set the trace ID for this function call 163 - langfuse_parent_observation_id: Explicitly set the parent span ID 164 - langfuse_public_key: Use a specific Langfuse project (when multiple clients exist) 165 - For async functions, the decorator returns an async function wrapper. 166 - For sync functions, the decorator returns a synchronous wrapper. 167 """ 168 valid_types = set(get_observation_types_list(ObservationTypeLiteralNoEvent)) 169 if as_type is not None and as_type not in valid_types: 170 self._log.warning( 171 f"Invalid as_type '{as_type}'. Valid types are: {', '.join(sorted(valid_types))}. Defaulting to 'span'." 172 ) 173 as_type = "span" 174 175 function_io_capture_enabled = os.environ.get( 176 LANGFUSE_OBSERVE_DECORATOR_IO_CAPTURE_ENABLED, "True" 177 ).lower() not in ("false", "0") 178 179 should_capture_input = ( 180 capture_input if capture_input is not None else function_io_capture_enabled 181 ) 182 183 should_capture_output = ( 184 capture_output 185 if capture_output is not None 186 else function_io_capture_enabled 187 ) 188 189 def decorator(func: F) -> F: 190 return ( 191 self._async_observe( 192 func, 193 name=name, 194 as_type=as_type, 195 capture_input=should_capture_input, 196 capture_output=should_capture_output, 197 transform_to_string=transform_to_string, 198 ) 199 if asyncio.iscoroutinefunction(func) 200 else self._sync_observe( 201 func, 202 name=name, 203 as_type=as_type, 204 capture_input=should_capture_input, 205 capture_output=should_capture_output, 206 transform_to_string=transform_to_string, 207 ) 208 ) 209 210 """Handle decorator with or without parentheses. 211 212 This logic enables the decorator to work both with and without parentheses: 213 - @observe - Python passes the function directly to the decorator 214 - @observe() - Python calls the decorator first, which must return a function decorator 215 216 When called without arguments (@observe), the func parameter contains the function to decorate, 217 so we directly apply the decorator to it. When called with parentheses (@observe()), 218 func is None, so we return the decorator function itself for Python to apply in the next step. 219 """ 220 if func is None: 221 return decorator 222 else: 223 return decorator(func)
Wrap a function to create and manage Langfuse tracing around its execution, supporting both synchronous and asynchronous functions.
This decorator provides seamless integration of Langfuse observability into your codebase. It automatically creates spans or generations around function execution, capturing timing, inputs/outputs, and error states. The decorator intelligently handles both synchronous and asynchronous functions, preserving function signatures and type hints.
Using OpenTelemetry's distributed tracing system, it maintains proper trace context propagation throughout your application, enabling you to see hierarchical traces of function calls with detailed performance metrics and function-specific details.
Arguments:
- func (Optional[Callable]): The function to decorate. When used with parentheses @observe(), this will be None.
- name (Optional[str]): Custom name for the created trace or span. If not provided, the function name is used.
- as_type (Optional[Literal]): Set the observation type. Supported values: "generation", "span", "agent", "tool", "chain", "retriever", "embedding", "evaluator", "guardrail". Observation types are highlighted in the Langfuse UI for filtering and visualization. The types "generation" and "embedding" create a span on which additional attributes such as model metrics can be set.
Returns:
Callable: A wrapped version of the original function that automatically creates and manages Langfuse spans.
Example:
For general function tracing with automatic naming:
```python
@observe()
def process_user_request(user_id, query):
    # Function is automatically traced with name "process_user_request"
    return get_response(query)
```

For language model generation tracking:

```python
@observe(name="answer-generation", as_type="generation")
async def generate_answer(query):
    # Creates a generation-type span with extended LLM metrics
    response = await openai.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": query}]
    )
    return response.choices[0].message.content
```

For trace context propagation between functions:

```python
@observe()
def main_process():
    # Parent span is created
    return sub_process()  # Child span automatically connected to parent

@observe()
def sub_process():
    # Automatically becomes a child span of main_process
    return "result"
```
Raises:
- Exception: Propagates any exceptions from the wrapped function after logging them in the trace.
Notes:
- The decorator preserves the original function's signature, docstring, and return type.
- Proper parent-child relationships between spans are automatically maintained.
- Special keyword arguments can be passed to control tracing (see the sketch after these notes):
- langfuse_trace_id: Explicitly set the trace ID for this function call
- langfuse_parent_observation_id: Explicitly set the parent span ID
- langfuse_public_key: Use a specific Langfuse project (when multiple clients exist)
- For async functions, the decorator returns an async function wrapper.
- For sync functions, the decorator returns a synchronous wrapper.
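As a minimal sketch of these special keyword arguments at the call site (the ID variables are hypothetical placeholders you would obtain elsewhere):

```python
@observe()
def handle_request(query):
    return query.upper()

# Hypothetical: route this call into an existing trace and parent observation
handle_request(
    "hello",
    langfuse_trace_id=existing_trace_id,  # placeholder for a valid trace ID
    langfuse_parent_observation_id=parent_observation_id,  # placeholder parent span ID
)
```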
74def propagate_attributes( 75 *, 76 user_id: Optional[str] = None, 77 session_id: Optional[str] = None, 78 metadata: Optional[Dict[str, str]] = None, 79 version: Optional[str] = None, 80 tags: Optional[List[str]] = None, 81 as_baggage: bool = False, 82) -> _AgnosticContextManager[Any]: 83 """Propagate trace-level attributes to all spans created within this context. 84 85 This context manager sets attributes on the currently active span AND automatically 86 propagates them to all new child spans created within the context. This is the 87 recommended way to set trace-level attributes like user_id, session_id, and metadata 88 dimensions that should be consistently applied across all observations in a trace. 89 90 **IMPORTANT**: Call this as early as possible within your trace/workflow. Only the 91 currently active span and spans created after entering this context will have these 92 attributes. Pre-existing spans will NOT be retroactively updated. 93 94 **Why this matters**: Langfuse aggregation queries (e.g., total cost by user_id, 95 filtering by session_id) only include observations that have the attribute set. 96 If you call `propagate_attributes` late in your workflow, earlier spans won't be 97 included in aggregations for that attribute. 98 99 Args: 100 user_id: User identifier to associate with all spans in this context. 101 Must be US-ASCII string, ≤200 characters. Use this to track which user 102 generated each trace and enable e.g. per-user cost/performance analysis. 103 session_id: Session identifier to associate with all spans in this context. 104 Must be US-ASCII string, ≤200 characters. Use this to group related traces 105 within a user session (e.g., a conversation thread, multi-turn interaction). 106 metadata: Additional key-value metadata to propagate to all spans. 107 - Keys and values must be US-ASCII strings 108 - All values must be ≤200 characters 109 - Use for dimensions like internal correlating identifiers 110 - AVOID: large payloads, sensitive data, non-string values (will be dropped with warning) 111 version: Version identfier for parts of your application that are independently versioned, e.g. agents 112 tags: List of tags to categorize the group of observations 113 as_baggage: If True, propagates attributes using OpenTelemetry baggage for 114 cross-process/service propagation. **Security warning**: When enabled, 115 attribute values are added to HTTP headers on ALL outbound requests. 116 Only enable if values are safe to transmit via HTTP headers and you need 117 cross-service tracing. Default: False. 118 119 Returns: 120 Context manager that propagates attributes to all child spans. 121 122 Example: 123 Basic usage with user and session tracking: 124 125 ```python 126 from langfuse import Langfuse 127 128 langfuse = Langfuse() 129 130 # Set attributes early in the trace 131 with langfuse.start_as_current_span(name="user_workflow") as span: 132 with langfuse.propagate_attributes( 133 user_id="user_123", 134 session_id="session_abc", 135 metadata={"experiment": "variant_a", "environment": "production"} 136 ): 137 # All spans created here will have user_id, session_id, and metadata 138 with langfuse.start_span(name="llm_call") as llm_span: 139 # This span inherits: user_id, session_id, experiment, environment 140 ... 141 142 with langfuse.start_generation(name="completion") as gen: 143 # This span also inherits all attributes 144 ... 
145 ``` 146 147 Late propagation (anti-pattern): 148 149 ```python 150 with langfuse.start_as_current_span(name="workflow") as span: 151 # These spans WON'T have user_id 152 early_span = langfuse.start_span(name="early_work") 153 early_span.end() 154 155 # Set attributes in the middle 156 with langfuse.propagate_attributes(user_id="user_123"): 157 # Only spans created AFTER this point will have user_id 158 late_span = langfuse.start_span(name="late_work") 159 late_span.end() 160 161 # Result: Aggregations by user_id will miss "early_work" span 162 ``` 163 164 Cross-service propagation with baggage (advanced): 165 166 ```python 167 # Service A - originating service 168 with langfuse.start_as_current_span(name="api_request"): 169 with langfuse.propagate_attributes( 170 user_id="user_123", 171 session_id="session_abc", 172 as_baggage=True # Propagate via HTTP headers 173 ): 174 # Make HTTP request to Service B 175 response = requests.get("https://service-b.example.com/api") 176 # user_id and session_id are now in HTTP headers 177 178 # Service B - downstream service 179 # OpenTelemetry will automatically extract baggage from HTTP headers 180 # and propagate to spans in Service B 181 ``` 182 183 Note: 184 - **Validation**: All attribute values (user_id, session_id, metadata values) 185 must be strings ≤200 characters. Invalid values will be dropped with a 186 warning logged. Ensure values meet constraints before calling. 187 - **OpenTelemetry**: This uses OpenTelemetry context propagation under the hood, 188 making it compatible with other OTel-instrumented libraries. 189 190 Raises: 191 No exceptions are raised. Invalid values are logged as warnings and dropped. 192 """ 193 return _propagate_attributes( 194 user_id=user_id, 195 session_id=session_id, 196 metadata=metadata, 197 version=version, 198 tags=tags, 199 as_baggage=as_baggage, 200 )
Propagate trace-level attributes to all spans created within this context.
This context manager sets attributes on the currently active span AND automatically propagates them to all new child spans created within the context. This is the recommended way to set trace-level attributes like user_id, session_id, and metadata dimensions that should be consistently applied across all observations in a trace.
IMPORTANT: Call this as early as possible within your trace/workflow. Only the currently active span and spans created after entering this context will have these attributes. Pre-existing spans will NOT be retroactively updated.
Why this matters: Langfuse aggregation queries (e.g., total cost by user_id, filtering by session_id) only include observations that have the attribute set. If you call propagate_attributes late in your workflow, earlier spans won't be included in aggregations for that attribute.
Arguments:
- user_id: User identifier to associate with all spans in this context. Must be US-ASCII string, ≤200 characters. Use this to track which user generated each trace and enable e.g. per-user cost/performance analysis.
- session_id: Session identifier to associate with all spans in this context. Must be US-ASCII string, ≤200 characters. Use this to group related traces within a user session (e.g., a conversation thread, multi-turn interaction).
- metadata: Additional key-value metadata to propagate to all spans.
- Keys and values must be US-ASCII strings
- All values must be ≤200 characters
- Use for dimensions like internal correlating identifiers
- AVOID: large payloads, sensitive data, non-string values (will be dropped with warning)
- version: Version identifier for parts of your application that are independently versioned, e.g. agents.
- tags: List of tags to categorize the group of observations.
- as_baggage: If True, propagates attributes using OpenTelemetry baggage for cross-process/service propagation. Security warning: When enabled, attribute values are added to HTTP headers on ALL outbound requests. Only enable if values are safe to transmit via HTTP headers and you need cross-service tracing. Default: False.
Returns:
Context manager that propagates attributes to all child spans.
Example:
Basic usage with user and session tracking:
```python
from langfuse import Langfuse

langfuse = Langfuse()

# Set attributes early in the trace
with langfuse.start_as_current_span(name="user_workflow") as span:
    with langfuse.propagate_attributes(
        user_id="user_123",
        session_id="session_abc",
        metadata={"experiment": "variant_a", "environment": "production"}
    ):
        # All spans created here will have user_id, session_id, and metadata
        with langfuse.start_span(name="llm_call") as llm_span:
            # This span inherits: user_id, session_id, experiment, environment
            ...

        with langfuse.start_generation(name="completion") as gen:
            # This span also inherits all attributes
            ...
```

Late propagation (anti-pattern):

```python
with langfuse.start_as_current_span(name="workflow") as span:
    # These spans WON'T have user_id
    early_span = langfuse.start_span(name="early_work")
    early_span.end()

    # Set attributes in the middle
    with langfuse.propagate_attributes(user_id="user_123"):
        # Only spans created AFTER this point will have user_id
        late_span = langfuse.start_span(name="late_work")
        late_span.end()

    # Result: Aggregations by user_id will miss "early_work" span
```

Cross-service propagation with baggage (advanced):

```python
# Service A - originating service
with langfuse.start_as_current_span(name="api_request"):
    with langfuse.propagate_attributes(
        user_id="user_123",
        session_id="session_abc",
        as_baggage=True  # Propagate via HTTP headers
    ):
        # Make HTTP request to Service B
        response = requests.get("https://service-b.example.com/api")
        # user_id and session_id are now in HTTP headers

# Service B - downstream service
# OpenTelemetry will automatically extract baggage from HTTP headers
# and propagate to spans in Service B
```
Note:
- Validation: All attribute values (user_id, session_id, metadata values) must be strings ≤200 characters. Invalid values will be dropped with a warning logged. Ensure values meet constraints before calling.
- OpenTelemetry: This uses OpenTelemetry context propagation under the hood, making it compatible with other OTel-instrumented libraries.
Raises:
- No exceptions are raised. Invalid values are logged as warnings and dropped.
1146class LangfuseSpan(LangfuseObservationWrapper): 1147 """Standard span implementation for general operations in Langfuse. 1148 1149 This class represents a general-purpose span that can be used to trace 1150 any operation in your application. It extends the base LangfuseObservationWrapper 1151 with specific methods for creating child spans, generations, and updating 1152 span-specific attributes. If possible, use a more specific type for 1153 better observability and insights. 1154 """ 1155 1156 def __init__( 1157 self, 1158 *, 1159 otel_span: otel_trace_api.Span, 1160 langfuse_client: "Langfuse", 1161 input: Optional[Any] = None, 1162 output: Optional[Any] = None, 1163 metadata: Optional[Any] = None, 1164 environment: Optional[str] = None, 1165 version: Optional[str] = None, 1166 level: Optional[SpanLevel] = None, 1167 status_message: Optional[str] = None, 1168 ): 1169 """Initialize a new LangfuseSpan. 1170 1171 Args: 1172 otel_span: The OpenTelemetry span to wrap 1173 langfuse_client: Reference to the parent Langfuse client 1174 input: Input data for the span (any JSON-serializable object) 1175 output: Output data from the span (any JSON-serializable object) 1176 metadata: Additional metadata to associate with the span 1177 environment: The tracing environment 1178 version: Version identifier for the code or component 1179 level: Importance level of the span (info, warning, error) 1180 status_message: Optional status message for the span 1181 """ 1182 super().__init__( 1183 otel_span=otel_span, 1184 as_type="span", 1185 langfuse_client=langfuse_client, 1186 input=input, 1187 output=output, 1188 metadata=metadata, 1189 environment=environment, 1190 version=version, 1191 level=level, 1192 status_message=status_message, 1193 ) 1194 1195 def start_span( 1196 self, 1197 name: str, 1198 input: Optional[Any] = None, 1199 output: Optional[Any] = None, 1200 metadata: Optional[Any] = None, 1201 version: Optional[str] = None, 1202 level: Optional[SpanLevel] = None, 1203 status_message: Optional[str] = None, 1204 ) -> "LangfuseSpan": 1205 """Create a new child span. 1206 1207 This method creates a new child span with this span as the parent. 1208 Unlike start_as_current_span(), this method does not set the new span 1209 as the current span in the context. 
1210 1211 Args: 1212 name: Name of the span (e.g., function or operation name) 1213 input: Input data for the operation 1214 output: Output data from the operation 1215 metadata: Additional metadata to associate with the span 1216 version: Version identifier for the code or component 1217 level: Importance level of the span (info, warning, error) 1218 status_message: Optional status message for the span 1219 1220 Returns: 1221 A new LangfuseSpan that must be ended with .end() when complete 1222 1223 Example: 1224 ```python 1225 parent_span = langfuse.start_span(name="process-request") 1226 try: 1227 # Create a child span 1228 child_span = parent_span.start_span(name="validate-input") 1229 try: 1230 # Do validation work 1231 validation_result = validate(request_data) 1232 child_span.update(output=validation_result) 1233 finally: 1234 child_span.end() 1235 1236 # Continue with parent span 1237 result = process_validated_data(validation_result) 1238 parent_span.update(output=result) 1239 finally: 1240 parent_span.end() 1241 ``` 1242 """ 1243 return self.start_observation( 1244 name=name, 1245 as_type="span", 1246 input=input, 1247 output=output, 1248 metadata=metadata, 1249 version=version, 1250 level=level, 1251 status_message=status_message, 1252 ) 1253 1254 def start_as_current_span( 1255 self, 1256 *, 1257 name: str, 1258 input: Optional[Any] = None, 1259 output: Optional[Any] = None, 1260 metadata: Optional[Any] = None, 1261 version: Optional[str] = None, 1262 level: Optional[SpanLevel] = None, 1263 status_message: Optional[str] = None, 1264 ) -> _AgnosticContextManager["LangfuseSpan"]: 1265 """[DEPRECATED] Create a new child span and set it as the current span in a context manager. 1266 1267 DEPRECATED: This method is deprecated and will be removed in a future version. 1268 Use start_as_current_observation(as_type='span') instead. 1269 1270 This method creates a new child span and sets it as the current span within 1271 a context manager. It should be used with a 'with' statement to automatically 1272 manage the span's lifecycle. 1273 1274 Args: 1275 name: Name of the span (e.g., function or operation name) 1276 input: Input data for the operation 1277 output: Output data from the operation 1278 metadata: Additional metadata to associate with the span 1279 version: Version identifier for the code or component 1280 level: Importance level of the span (info, warning, error) 1281 status_message: Optional status message for the span 1282 1283 Returns: 1284 A context manager that yields a new LangfuseSpan 1285 1286 Example: 1287 ```python 1288 with langfuse.start_as_current_span(name="process-request") as parent_span: 1289 # Parent span is active here 1290 1291 # Create a child span with context management 1292 with parent_span.start_as_current_span(name="validate-input") as child_span: 1293 # Child span is active here 1294 validation_result = validate(request_data) 1295 child_span.update(output=validation_result) 1296 1297 # Back to parent span context 1298 result = process_validated_data(validation_result) 1299 parent_span.update(output=result) 1300 ``` 1301 """ 1302 warnings.warn( 1303 "start_as_current_span is deprecated and will be removed in a future version. " 1304 "Use start_as_current_observation(as_type='span') instead.", 1305 DeprecationWarning, 1306 stacklevel=2, 1307 ) 1308 return self.start_as_current_observation( 1309 name=name, 1310 as_type="span", 1311 input=input, 1312 output=output, 1313 metadata=metadata, 1314 version=version, 1315 level=level, 1316 status_message=status_message, 1317 ) 1318
1319 def start_generation( 1320 self, 1321 *, 1322 name: str, 1323 input: Optional[Any] = None, 1324 output: Optional[Any] = None, 1325 metadata: Optional[Any] = None, 1326 version: Optional[str] = None, 1327 level: Optional[SpanLevel] = None, 1328 status_message: Optional[str] = None, 1329 completion_start_time: Optional[datetime] = None, 1330 model: Optional[str] = None, 1331 model_parameters: Optional[Dict[str, MapValue]] = None, 1332 usage_details: Optional[Dict[str, int]] = None, 1333 cost_details: Optional[Dict[str, float]] = None, 1334 prompt: Optional[PromptClient] = None, 1335 ) -> "LangfuseGeneration": 1336 """[DEPRECATED] Create a new child generation span. 1337 1338 DEPRECATED: This method is deprecated and will be removed in a future version. 1339 Use start_observation(as_type='generation') instead. 1340 1341 This method creates a new child generation span with this span as the parent. 1342 Generation spans are specialized for AI/LLM operations and include additional 1343 fields for model information, usage stats, and costs. 1344 1345 Unlike start_as_current_generation(), this method does not set the new span 1346 as the current span in the context. 1347 1348 Args: 1349 name: Name of the generation operation 1350 input: Input data for the model (e.g., prompts) 1351 output: Output from the model (e.g., completions) 1352 metadata: Additional metadata to associate with the generation 1353 version: Version identifier for the model or component 1354 level: Importance level of the generation (info, warning, error) 1355 status_message: Optional status message for the generation 1356 completion_start_time: When the model started generating the response 1357 model: Name/identifier of the AI model used (e.g., "gpt-4") 1358 model_parameters: Parameters used for the model (e.g., temperature, max_tokens) 1359 usage_details: Token usage information (e.g., prompt_tokens, completion_tokens) 1360 cost_details: Cost information for the model call 1361 prompt: Associated prompt template from Langfuse prompt management 1362 1363 Returns: 1364 A new LangfuseGeneration that must be ended with .end() when complete 1365 1366 Example: 1367 ```python 1368 span = langfuse.start_span(name="process-query") 1369 try: 1370 # Create a generation child span 1371 generation = span.start_generation( 1372 name="generate-answer", 1373 model="gpt-4", 1374 input={"prompt": "Explain quantum computing"} 1375 ) 1376 try: 1377 # Call model API 1378 response = llm.generate(...) 1379 1380 generation.update( 1381 output=response.text, 1382 usage_details={ 1383 "prompt_tokens": response.usage.prompt_tokens, 1384 "completion_tokens": response.usage.completion_tokens 1385 } 1386 ) 1387 finally: 1388 generation.end() 1389 1390 # Continue with parent span 1391 span.update(output={"answer": response.text, "source": "gpt-4"}) 1392 finally: 1393 span.end() 1394 ``` 1395 """ 1396 warnings.warn( 1397 "start_generation is deprecated and will be removed in a future version. " 1398 "Use start_observation(as_type='generation') instead.", 1399 DeprecationWarning, 1400 stacklevel=2, 1401 ) 1402 return self.start_observation( 1403 name=name, 1404 as_type="generation", 1405 input=input, 1406 output=output, 1407 metadata=metadata, 1408 version=version, 1409 level=level, 1410 status_message=status_message, 1411 completion_start_time=completion_start_time, 1412 model=model, 1413 model_parameters=model_parameters, 1414 usage_details=usage_details, 1415 cost_details=cost_details, 1416 prompt=prompt, 1417 ) 1418
1419 def start_as_current_generation( 1420 self, 1421 *, 1422 name: str, 1423 input: Optional[Any] = None, 1424 output: Optional[Any] = None, 1425 metadata: Optional[Any] = None, 1426 version: Optional[str] = None, 1427 level: Optional[SpanLevel] = None, 1428 status_message: Optional[str] = None, 1429 completion_start_time: Optional[datetime] = None, 1430 model: Optional[str] = None, 1431 model_parameters: Optional[Dict[str, MapValue]] = None, 1432 usage_details: Optional[Dict[str, int]] = None, 1433 cost_details: Optional[Dict[str, float]] = None, 1434 prompt: Optional[PromptClient] = None, 1435 ) -> _AgnosticContextManager["LangfuseGeneration"]: 1436 """[DEPRECATED] Create a new child generation span and set it as the current span in a context manager. 1437 1438 DEPRECATED: This method is deprecated and will be removed in a future version. 1439 Use start_as_current_observation(as_type='generation') instead. 1440 1441 This method creates a new child generation span and sets it as the current span 1442 within a context manager. Generation spans are specialized for AI/LLM operations 1443 and include additional fields for model information, usage stats, and costs. 1444 1445 Args: 1446 name: Name of the generation operation 1447 input: Input data for the model (e.g., prompts) 1448 output: Output from the model (e.g., completions) 1449 metadata: Additional metadata to associate with the generation 1450 version: Version identifier for the model or component 1451 level: Importance level of the generation (info, warning, error) 1452 status_message: Optional status message for the generation 1453 completion_start_time: When the model started generating the response 1454 model: Name/identifier of the AI model used (e.g., "gpt-4") 1455 model_parameters: Parameters used for the model (e.g., temperature, max_tokens) 1456 usage_details: Token usage information (e.g., prompt_tokens, completion_tokens) 1457 cost_details: Cost information for the model call 1458 prompt: Associated prompt template from Langfuse prompt management 1459 1460 Returns: 1461 A context manager that yields a new LangfuseGeneration 1462 1463 Example: 1464 ```python 1465 with langfuse.start_as_current_span(name="process-request") as span: 1466 # Prepare data 1467 query = preprocess_user_query(user_input) 1468 1469 # Create a generation span with context management 1470 with span.start_as_current_generation( 1471 name="generate-answer", 1472 model="gpt-4", 1473 input={"query": query} 1474 ) as generation: 1475 # Generation span is active here 1476 response = llm.generate(query) 1477 1478 # Update with results 1479 generation.update( 1480 output=response.text, 1481 usage_details={ 1482 "prompt_tokens": response.usage.prompt_tokens, 1483 "completion_tokens": response.usage.completion_tokens 1484 } 1485 ) 1486 1487 # Back to parent span context 1488 span.update(output={"answer": response.text, "source": "gpt-4"}) 1489 ``` 1490 """ 1491 warnings.warn( 1492 "start_as_current_generation is deprecated and will be removed in a future version. " 1493 "Use start_as_current_observation(as_type='generation') instead.", 1494 DeprecationWarning, 1495 stacklevel=2, 1496 ) 1497 return self.start_as_current_observation( 1498 name=name, 1499 as_type="generation", 1500 input=input, 1501 output=output, 1502 metadata=metadata, 1503 version=version, 1504 level=level, 1505 status_message=status_message, 1506 completion_start_time=completion_start_time, 1507 model=model, 1508 model_parameters=model_parameters, 1509 usage_details=usage_details, 1510 cost_details=cost_details, 1511 prompt=prompt, 1512 ) 1513
1514 def create_event( 1515 self, 1516 *, 1517 name: str, 1518 input: Optional[Any] = None, 1519 output: Optional[Any] = None, 1520 metadata: Optional[Any] = None, 1521 version: Optional[str] = None, 1522 level: Optional[SpanLevel] = None, 1523 status_message: Optional[str] = None, 1524 ) -> "LangfuseEvent": 1525 """Create a new Langfuse observation of type 'EVENT'. 1526 1527 Args: 1528 name: Name of the span (e.g., function or operation name) 1529 input: Input data for the operation (can be any JSON-serializable object) 1530 output: Output data from the operation (can be any JSON-serializable object) 1531 metadata: Additional metadata to associate with the span 1532 version: Version identifier for the code or component 1533 level: Importance level of the span (info, warning, error) 1534 status_message: Optional status message for the span 1535 1536 Returns: 1537 The LangfuseEvent object 1538 1539 Example: 1540 ```python 1541 event = langfuse.create_event(name="process-event") 1542 ``` 1543 """ 1544 timestamp = time_ns() 1545 1546 with otel_trace_api.use_span(self._otel_span): 1547 new_otel_span = self._langfuse_client._otel_tracer.start_span( 1548 name=name, start_time=timestamp 1549 ) 1550 1551 return cast( 1552 "LangfuseEvent", 1553 LangfuseEvent( 1554 otel_span=new_otel_span, 1555 langfuse_client=self._langfuse_client, 1556 input=input, 1557 output=output, 1558 metadata=metadata, 1559 environment=self._environment, 1560 version=version, 1561 level=level, 1562 status_message=status_message, 1563 ).end(end_time=timestamp), 1564 )
Standard span implementation for general operations in Langfuse.
This class represents a general-purpose span that can be used to trace any operation in your application. It extends the base LangfuseObservationWrapper with specific methods for creating child spans, generations, and updating span-specific attributes. If possible, use a more specific type for better observability and insights.
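As a sketch of that advice, a more specific child type can be created via start_observation, which the convenience methods in these listings delegate to (the names below are illustrative):

```python
span = langfuse.start_span(name="agent-step")

# A child observation typed as a tool call for richer UI filtering
tool_call = span.start_observation(name="web-search", as_type="tool")
tool_call.end()

span.end()
```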
1156 def __init__( 1157 self, 1158 *, 1159 otel_span: otel_trace_api.Span, 1160 langfuse_client: "Langfuse", 1161 input: Optional[Any] = None, 1162 output: Optional[Any] = None, 1163 metadata: Optional[Any] = None, 1164 environment: Optional[str] = None, 1165 version: Optional[str] = None, 1166 level: Optional[SpanLevel] = None, 1167 status_message: Optional[str] = None, 1168 ): 1169 """Initialize a new LangfuseSpan. 1170 1171 Args: 1172 otel_span: The OpenTelemetry span to wrap 1173 langfuse_client: Reference to the parent Langfuse client 1174 input: Input data for the span (any JSON-serializable object) 1175 output: Output data from the span (any JSON-serializable object) 1176 metadata: Additional metadata to associate with the span 1177 environment: The tracing environment 1178 version: Version identifier for the code or component 1179 level: Importance level of the span (info, warning, error) 1180 status_message: Optional status message for the span 1181 """ 1182 super().__init__( 1183 otel_span=otel_span, 1184 as_type="span", 1185 langfuse_client=langfuse_client, 1186 input=input, 1187 output=output, 1188 metadata=metadata, 1189 environment=environment, 1190 version=version, 1191 level=level, 1192 status_message=status_message, 1193 )
Initialize a new LangfuseSpan.
Arguments:
- otel_span: The OpenTelemetry span to wrap
- langfuse_client: Reference to the parent Langfuse client
- input: Input data for the span (any JSON-serializable object)
- output: Output data from the span (any JSON-serializable object)
- metadata: Additional metadata to associate with the span
- environment: The tracing environment
- version: Version identifier for the code or component
- level: Importance level of the span (info, warning, error)
- status_message: Optional status message for the span
1195 def start_span( 1196 self, 1197 name: str, 1198 input: Optional[Any] = None, 1199 output: Optional[Any] = None, 1200 metadata: Optional[Any] = None, 1201 version: Optional[str] = None, 1202 level: Optional[SpanLevel] = None, 1203 status_message: Optional[str] = None, 1204 ) -> "LangfuseSpan": 1205 """Create a new child span. 1206 1207 This method creates a new child span with this span as the parent. 1208 Unlike start_as_current_span(), this method does not set the new span 1209 as the current span in the context. 1210 1211 Args: 1212 name: Name of the span (e.g., function or operation name) 1213 input: Input data for the operation 1214 output: Output data from the operation 1215 metadata: Additional metadata to associate with the span 1216 version: Version identifier for the code or component 1217 level: Importance level of the span (info, warning, error) 1218 status_message: Optional status message for the span 1219 1220 Returns: 1221 A new LangfuseSpan that must be ended with .end() when complete 1222 1223 Example: 1224 ```python 1225 parent_span = langfuse.start_span(name="process-request") 1226 try: 1227 # Create a child span 1228 child_span = parent_span.start_span(name="validate-input") 1229 try: 1230 # Do validation work 1231 validation_result = validate(request_data) 1232 child_span.update(output=validation_result) 1233 finally: 1234 child_span.end() 1235 1236 # Continue with parent span 1237 result = process_validated_data(validation_result) 1238 parent_span.update(output=result) 1239 finally: 1240 parent_span.end() 1241 ``` 1242 """ 1243 return self.start_observation( 1244 name=name, 1245 as_type="span", 1246 input=input, 1247 output=output, 1248 metadata=metadata, 1249 version=version, 1250 level=level, 1251 status_message=status_message, 1252 )
Create a new child span.
This method creates a new child span with this span as the parent. Unlike start_as_current_span(), this method does not set the new span as the current span in the context.
Arguments:
- name: Name of the span (e.g., function or operation name)
- input: Input data for the operation
- output: Output data from the operation
- metadata: Additional metadata to associate with the span
- version: Version identifier for the code or component
- level: Importance level of the span (info, warning, error)
- status_message: Optional status message for the span
Returns:
A new LangfuseSpan that must be ended with .end() when complete
Example:
```python
parent_span = langfuse.start_span(name="process-request")
try:
    # Create a child span
    child_span = parent_span.start_span(name="validate-input")
    try:
        # Do validation work
        validation_result = validate(request_data)
        child_span.update(output=validation_result)
    finally:
        child_span.end()

    # Continue with parent span
    result = process_validated_data(validation_result)
    parent_span.update(output=result)
finally:
    parent_span.end()
```
1254 def start_as_current_span( 1255 self, 1256 *, 1257 name: str, 1258 input: Optional[Any] = None, 1259 output: Optional[Any] = None, 1260 metadata: Optional[Any] = None, 1261 version: Optional[str] = None, 1262 level: Optional[SpanLevel] = None, 1263 status_message: Optional[str] = None, 1264 ) -> _AgnosticContextManager["LangfuseSpan"]: 1265 """[DEPRECATED] Create a new child span and set it as the current span in a context manager. 1266 1267 DEPRECATED: This method is deprecated and will be removed in a future version. 1268 Use start_as_current_observation(as_type='span') instead. 1269 1270 This method creates a new child span and sets it as the current span within 1271 a context manager. It should be used with a 'with' statement to automatically 1272 manage the span's lifecycle. 1273 1274 Args: 1275 name: Name of the span (e.g., function or operation name) 1276 input: Input data for the operation 1277 output: Output data from the operation 1278 metadata: Additional metadata to associate with the span 1279 version: Version identifier for the code or component 1280 level: Importance level of the span (info, warning, error) 1281 status_message: Optional status message for the span 1282 1283 Returns: 1284 A context manager that yields a new LangfuseSpan 1285 1286 Example: 1287 ```python 1288 with langfuse.start_as_current_span(name="process-request") as parent_span: 1289 # Parent span is active here 1290 1291 # Create a child span with context management 1292 with parent_span.start_as_current_span(name="validate-input") as child_span: 1293 # Child span is active here 1294 validation_result = validate(request_data) 1295 child_span.update(output=validation_result) 1296 1297 # Back to parent span context 1298 result = process_validated_data(validation_result) 1299 parent_span.update(output=result) 1300 ``` 1301 """ 1302 warnings.warn( 1303 "start_as_current_span is deprecated and will be removed in a future version. " 1304 "Use start_as_current_observation(as_type='span') instead.", 1305 DeprecationWarning, 1306 stacklevel=2, 1307 ) 1308 return self.start_as_current_observation( 1309 name=name, 1310 as_type="span", 1311 input=input, 1312 output=output, 1313 metadata=metadata, 1314 version=version, 1315 level=level, 1316 status_message=status_message, 1317 )
[DEPRECATED] Create a new child span and set it as the current span in a context manager.
DEPRECATED: This method is deprecated and will be removed in a future version. Use start_as_current_observation(as_type='span') instead.
This method creates a new child span and sets it as the current span within a context manager. It should be used with a 'with' statement to automatically manage the span's lifecycle.
Arguments:
- name: Name of the span (e.g., function or operation name)
- input: Input data for the operation
- output: Output data from the operation
- metadata: Additional metadata to associate with the span
- version: Version identifier for the code or component
- level: Importance level of the span (info, warning, error)
- status_message: Optional status message for the span
Returns:
A context manager that yields a new LangfuseSpan
Example:
with langfuse.start_as_current_span(name="process-request") as parent_span: # Parent span is active here # Create a child span with context management with parent_span.start_as_current_span(name="validate-input") as child_span: # Child span is active here validation_result = validate(request_data) child_span.update(output=validation_result) # Back to parent span context result = process_validated_data(validation_result) parent_span.update(output=result)
1319 def start_generation( 1320 self, 1321 *, 1322 name: str, 1323 input: Optional[Any] = None, 1324 output: Optional[Any] = None, 1325 metadata: Optional[Any] = None, 1326 version: Optional[str] = None, 1327 level: Optional[SpanLevel] = None, 1328 status_message: Optional[str] = None, 1329 completion_start_time: Optional[datetime] = None, 1330 model: Optional[str] = None, 1331 model_parameters: Optional[Dict[str, MapValue]] = None, 1332 usage_details: Optional[Dict[str, int]] = None, 1333 cost_details: Optional[Dict[str, float]] = None, 1334 prompt: Optional[PromptClient] = None, 1335 ) -> "LangfuseGeneration": 1336 """[DEPRECATED] Create a new child generation span. 1337 1338 DEPRECATED: This method is deprecated and will be removed in a future version. 1339 Use start_observation(as_type='generation') instead. 1340 1341 This method creates a new child generation span with this span as the parent. 1342 Generation spans are specialized for AI/LLM operations and include additional 1343 fields for model information, usage stats, and costs. 1344 1345 Unlike start_as_current_generation(), this method does not set the new span 1346 as the current span in the context. 1347 1348 Args: 1349 name: Name of the generation operation 1350 input: Input data for the model (e.g., prompts) 1351 output: Output from the model (e.g., completions) 1352 metadata: Additional metadata to associate with the generation 1353 version: Version identifier for the model or component 1354 level: Importance level of the generation (info, warning, error) 1355 status_message: Optional status message for the generation 1356 completion_start_time: When the model started generating the response 1357 model: Name/identifier of the AI model used (e.g., "gpt-4") 1358 model_parameters: Parameters used for the model (e.g., temperature, max_tokens) 1359 usage_details: Token usage information (e.g., prompt_tokens, completion_tokens) 1360 cost_details: Cost information for the model call 1361 prompt: Associated prompt template from Langfuse prompt management 1362 1363 Returns: 1364 A new LangfuseGeneration that must be ended with .end() when complete 1365 1366 Example: 1367 ```python 1368 span = langfuse.start_span(name="process-query") 1369 try: 1370 # Create a generation child span 1371 generation = span.start_generation( 1372 name="generate-answer", 1373 model="gpt-4", 1374 input={"prompt": "Explain quantum computing"} 1375 ) 1376 try: 1377 # Call model API 1378 response = llm.generate(...) 1379 1380 generation.update( 1381 output=response.text, 1382 usage_details={ 1383 "prompt_tokens": response.usage.prompt_tokens, 1384 "completion_tokens": response.usage.completion_tokens 1385 } 1386 ) 1387 finally: 1388 generation.end() 1389 1390 # Continue with parent span 1391 span.update(output={"answer": response.text, "source": "gpt-4"}) 1392 finally: 1393 span.end() 1394 ``` 1395 """ 1396 warnings.warn( 1397 "start_generation is deprecated and will be removed in a future version. " 1398 "Use start_observation(as_type='generation') instead.", 1399 DeprecationWarning, 1400 stacklevel=2, 1401 ) 1402 return self.start_observation( 1403 name=name, 1404 as_type="generation", 1405 input=input, 1406 output=output, 1407 metadata=metadata, 1408 version=version, 1409 level=level, 1410 status_message=status_message, 1411 completion_start_time=completion_start_time, 1412 model=model, 1413 model_parameters=model_parameters, 1414 usage_details=usage_details, 1415 cost_details=cost_details, 1416 prompt=prompt, 1417 )
[DEPRECATED] Create a new child generation span.
DEPRECATED: This method is deprecated and will be removed in a future version. Use start_observation(as_type='generation') instead.
This method creates a new child generation span with this span as the parent. Generation spans are specialized for AI/LLM operations and include additional fields for model information, usage stats, and costs.
Unlike start_as_current_generation(), this method does not set the new span as the current span in the context.
Arguments:
- name: Name of the generation operation
- input: Input data for the model (e.g., prompts)
- output: Output from the model (e.g., completions)
- metadata: Additional metadata to associate with the generation
- version: Version identifier for the model or component
- level: Importance level of the generation (info, warning, error)
- status_message: Optional status message for the generation
- completion_start_time: When the model started generating the response
- model: Name/identifier of the AI model used (e.g., "gpt-4")
- model_parameters: Parameters used for the model (e.g., temperature, max_tokens)
- usage_details: Token usage information (e.g., prompt_tokens, completion_tokens)
- cost_details: Cost information for the model call
- prompt: Associated prompt template from Langfuse prompt management
Returns:
A new LangfuseGeneration that must be ended with .end() when complete
Example:
span = langfuse.start_span(name="process-query") try: # Create a generation child span generation = span.start_generation( name="generate-answer", model="gpt-4", input={"prompt": "Explain quantum computing"} ) try: # Call model API response = llm.generate(...) generation.update( output=response.text, usage_details={ "prompt_tokens": response.usage.prompt_tokens, "completion_tokens": response.usage.completion_tokens } ) finally: generation.end() # Continue with parent span span.update(output={"answer": response.text, "source": "gpt-4"}) finally: span.end()
1419 def start_as_current_generation( 1420 self, 1421 *, 1422 name: str, 1423 input: Optional[Any] = None, 1424 output: Optional[Any] = None, 1425 metadata: Optional[Any] = None, 1426 version: Optional[str] = None, 1427 level: Optional[SpanLevel] = None, 1428 status_message: Optional[str] = None, 1429 completion_start_time: Optional[datetime] = None, 1430 model: Optional[str] = None, 1431 model_parameters: Optional[Dict[str, MapValue]] = None, 1432 usage_details: Optional[Dict[str, int]] = None, 1433 cost_details: Optional[Dict[str, float]] = None, 1434 prompt: Optional[PromptClient] = None, 1435 ) -> _AgnosticContextManager["LangfuseGeneration"]: 1436 """[DEPRECATED] Create a new child generation span and set it as the current span in a context manager. 1437 1438 DEPRECATED: This method is deprecated and will be removed in a future version. 1439 Use start_as_current_observation(as_type='generation') instead. 1440 1441 This method creates a new child generation span and sets it as the current span 1442 within a context manager. Generation spans are specialized for AI/LLM operations 1443 and include additional fields for model information, usage stats, and costs. 1444 1445 Args: 1446 name: Name of the generation operation 1447 input: Input data for the model (e.g., prompts) 1448 output: Output from the model (e.g., completions) 1449 metadata: Additional metadata to associate with the generation 1450 version: Version identifier for the model or component 1451 level: Importance level of the generation (info, warning, error) 1452 status_message: Optional status message for the generation 1453 completion_start_time: When the model started generating the response 1454 model: Name/identifier of the AI model used (e.g., "gpt-4") 1455 model_parameters: Parameters used for the model (e.g., temperature, max_tokens) 1456 usage_details: Token usage information (e.g., prompt_tokens, completion_tokens) 1457 cost_details: Cost information for the model call 1458 prompt: Associated prompt template from Langfuse prompt management 1459 1460 Returns: 1461 A context manager that yields a new LangfuseGeneration 1462 1463 Example: 1464 ```python 1465 with langfuse.start_as_current_span(name="process-request") as span: 1466 # Prepare data 1467 query = preprocess_user_query(user_input) 1468 1469 # Create a generation span with context management 1470 with span.start_as_current_generation( 1471 name="generate-answer", 1472 model="gpt-4", 1473 input={"query": query} 1474 ) as generation: 1475 # Generation span is active here 1476 response = llm.generate(query) 1477 1478 # Update with results 1479 generation.update( 1480 output=response.text, 1481 usage_details={ 1482 "prompt_tokens": response.usage.prompt_tokens, 1483 "completion_tokens": response.usage.completion_tokens 1484 } 1485 ) 1486 1487 # Back to parent span context 1488 span.update(output={"answer": response.text, "source": "gpt-4"}) 1489 ``` 1490 """ 1491 warnings.warn( 1492 "start_as_current_generation is deprecated and will be removed in a future version. " 1493 "Use start_as_current_observation(as_type='generation') instead.", 1494 DeprecationWarning, 1495 stacklevel=2, 1496 ) 1497 return self.start_as_current_observation( 1498 name=name, 1499 as_type="generation", 1500 input=input, 1501 output=output, 1502 metadata=metadata, 1503 version=version, 1504 level=level, 1505 status_message=status_message, 1506 completion_start_time=completion_start_time, 1507 model=model, 1508 model_parameters=model_parameters, 1509 usage_details=usage_details, 1510 cost_details=cost_details, 1511 prompt=prompt, 1512 )
[DEPRECATED] Create a new child generation span and set it as the current span in a context manager.
DEPRECATED: This method is deprecated and will be removed in a future version. Use start_as_current_observation(as_type='generation') instead.
This method creates a new child generation span and sets it as the current span within a context manager. Generation spans are specialized for AI/LLM operations and include additional fields for model information, usage stats, and costs.
Arguments:
- name: Name of the generation operation
- input: Input data for the model (e.g., prompts)
- output: Output from the model (e.g., completions)
- metadata: Additional metadata to associate with the generation
- version: Version identifier for the model or component
- level: Importance level of the generation (info, warning, error)
- status_message: Optional status message for the generation
- completion_start_time: When the model started generating the response
- model: Name/identifier of the AI model used (e.g., "gpt-4")
- model_parameters: Parameters used for the model (e.g., temperature, max_tokens)
- usage_details: Token usage information (e.g., prompt_tokens, completion_tokens)
- cost_details: Cost information for the model call
- prompt: Associated prompt template from Langfuse prompt management
Returns:
A context manager that yields a new LangfuseGeneration
Example:
with langfuse.start_as_current_span(name="process-request") as span:
    # Prepare data
    query = preprocess_user_query(user_input)

    # Create a generation span with context management
    with span.start_as_current_generation(
        name="generate-answer",
        model="gpt-4",
        input={"query": query}
    ) as generation:
        # Generation span is active here
        response = llm.generate(query)

        # Update with results
        generation.update(
            output=response.text,
            usage_details={
                "prompt_tokens": response.usage.prompt_tokens,
                "completion_tokens": response.usage.completion_tokens
            }
        )

    # Back to parent span context
    span.update(output={"answer": response.text, "source": "gpt-4"})
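Since this method is deprecated, new code should use the replacement API named in the warning. A minimal sketch of the equivalent call, based on the delegation shown in the implementation above (parameters mirror the deprecated signature; `llm` and `query` are the same stand-ins as in the example):

```python
with langfuse.start_as_current_span(name="process-request") as span:
    # Same semantics as start_as_current_generation, via the supported API
    with span.start_as_current_observation(
        as_type="generation",
        name="generate-answer",
        model="gpt-4",
        input={"query": query},
    ) as generation:
        response = llm.generate(query)
        generation.update(output=response.text)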
1514 def create_event( 1515 self, 1516 *, 1517 name: str, 1518 input: Optional[Any] = None, 1519 output: Optional[Any] = None, 1520 metadata: Optional[Any] = None, 1521 version: Optional[str] = None, 1522 level: Optional[SpanLevel] = None, 1523 status_message: Optional[str] = None, 1524 ) -> "LangfuseEvent": 1525 """Create a new Langfuse observation of type 'EVENT'. 1526 1527 Args: 1528 name: Name of the span (e.g., function or operation name) 1529 input: Input data for the operation (can be any JSON-serializable object) 1530 output: Output data from the operation (can be any JSON-serializable object) 1531 metadata: Additional metadata to associate with the span 1532 version: Version identifier for the code or component 1533 level: Importance level of the span (info, warning, error) 1534 status_message: Optional status message for the span 1535 1536 Returns: 1537 The LangfuseEvent object 1538 1539 Example: 1540 ```python 1541 event = langfuse.create_event(name="process-event") 1542 ``` 1543 """ 1544 timestamp = time_ns() 1545 1546 with otel_trace_api.use_span(self._otel_span): 1547 new_otel_span = self._langfuse_client._otel_tracer.start_span( 1548 name=name, start_time=timestamp 1549 ) 1550 1551 return cast( 1552 "LangfuseEvent", 1553 LangfuseEvent( 1554 otel_span=new_otel_span, 1555 langfuse_client=self._langfuse_client, 1556 input=input, 1557 output=output, 1558 metadata=metadata, 1559 environment=self._environment, 1560 version=version, 1561 level=level, 1562 status_message=status_message, 1563 ).end(end_time=timestamp), 1564 )
Create a new Langfuse observation of type 'EVENT'.
Arguments:
- name: Name of the span (e.g., function or operation name)
- input: Input data for the operation (can be any JSON-serializable object)
- output: Output data from the operation (can be any JSON-serializable object)
- metadata: Additional metadata to associate with the span
- version: Version identifier for the code or component
- level: Importance level of the span (info, warning, error)
- status_message: Optional status message for the span
Returns:
The LangfuseEvent object
Example:
event = langfuse.create_event(name="process-event")
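A slightly fuller sketch (the payload names are hypothetical); note that, as the implementation above shows, the event is ended immediately at its creation timestamp:

```python
event = langfuse.create_event(
    name="cache-lookup",
    input={"key": "user:42"},      # hypothetical payload
    output={"hit": True},
    metadata={"store": "redis"},
)
```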
1567class LangfuseGeneration(LangfuseObservationWrapper): 1568 """Specialized span implementation for AI model generations in Langfuse. 1569 1570 This class represents a generation span specifically designed for tracking 1571 AI/LLM operations. It extends the base LangfuseObservationWrapper with specialized 1572 attributes for model details, token usage, and costs. 1573 """ 1574 1575 def __init__( 1576 self, 1577 *, 1578 otel_span: otel_trace_api.Span, 1579 langfuse_client: "Langfuse", 1580 input: Optional[Any] = None, 1581 output: Optional[Any] = None, 1582 metadata: Optional[Any] = None, 1583 environment: Optional[str] = None, 1584 version: Optional[str] = None, 1585 level: Optional[SpanLevel] = None, 1586 status_message: Optional[str] = None, 1587 completion_start_time: Optional[datetime] = None, 1588 model: Optional[str] = None, 1589 model_parameters: Optional[Dict[str, MapValue]] = None, 1590 usage_details: Optional[Dict[str, int]] = None, 1591 cost_details: Optional[Dict[str, float]] = None, 1592 prompt: Optional[PromptClient] = None, 1593 ): 1594 """Initialize a new LangfuseGeneration span. 1595 1596 Args: 1597 otel_span: The OpenTelemetry span to wrap 1598 langfuse_client: Reference to the parent Langfuse client 1599 input: Input data for the generation (e.g., prompts) 1600 output: Output from the generation (e.g., completions) 1601 metadata: Additional metadata to associate with the generation 1602 environment: The tracing environment 1603 version: Version identifier for the model or component 1604 level: Importance level of the generation (info, warning, error) 1605 status_message: Optional status message for the generation 1606 completion_start_time: When the model started generating the response 1607 model: Name/identifier of the AI model used (e.g., "gpt-4") 1608 model_parameters: Parameters used for the model (e.g., temperature, max_tokens) 1609 usage_details: Token usage information (e.g., prompt_tokens, completion_tokens) 1610 cost_details: Cost information for the model call 1611 prompt: Associated prompt template from Langfuse prompt management 1612 """ 1613 super().__init__( 1614 as_type="generation", 1615 otel_span=otel_span, 1616 langfuse_client=langfuse_client, 1617 input=input, 1618 output=output, 1619 metadata=metadata, 1620 environment=environment, 1621 version=version, 1622 level=level, 1623 status_message=status_message, 1624 completion_start_time=completion_start_time, 1625 model=model, 1626 model_parameters=model_parameters, 1627 usage_details=usage_details, 1628 cost_details=cost_details, 1629 prompt=prompt, 1630 )
Specialized span implementation for AI model generations in Langfuse.
This class represents a generation span specifically designed for tracking AI/LLM operations. It extends the base LangfuseObservationWrapper with specialized attributes for model details, token usage, and costs.
1575 def __init__( 1576 self, 1577 *, 1578 otel_span: otel_trace_api.Span, 1579 langfuse_client: "Langfuse", 1580 input: Optional[Any] = None, 1581 output: Optional[Any] = None, 1582 metadata: Optional[Any] = None, 1583 environment: Optional[str] = None, 1584 version: Optional[str] = None, 1585 level: Optional[SpanLevel] = None, 1586 status_message: Optional[str] = None, 1587 completion_start_time: Optional[datetime] = None, 1588 model: Optional[str] = None, 1589 model_parameters: Optional[Dict[str, MapValue]] = None, 1590 usage_details: Optional[Dict[str, int]] = None, 1591 cost_details: Optional[Dict[str, float]] = None, 1592 prompt: Optional[PromptClient] = None, 1593 ): 1594 """Initialize a new LangfuseGeneration span. 1595 1596 Args: 1597 otel_span: The OpenTelemetry span to wrap 1598 langfuse_client: Reference to the parent Langfuse client 1599 input: Input data for the generation (e.g., prompts) 1600 output: Output from the generation (e.g., completions) 1601 metadata: Additional metadata to associate with the generation 1602 environment: The tracing environment 1603 version: Version identifier for the model or component 1604 level: Importance level of the generation (info, warning, error) 1605 status_message: Optional status message for the generation 1606 completion_start_time: When the model started generating the response 1607 model: Name/identifier of the AI model used (e.g., "gpt-4") 1608 model_parameters: Parameters used for the model (e.g., temperature, max_tokens) 1609 usage_details: Token usage information (e.g., prompt_tokens, completion_tokens) 1610 cost_details: Cost information for the model call 1611 prompt: Associated prompt template from Langfuse prompt management 1612 """ 1613 super().__init__( 1614 as_type="generation", 1615 otel_span=otel_span, 1616 langfuse_client=langfuse_client, 1617 input=input, 1618 output=output, 1619 metadata=metadata, 1620 environment=environment, 1621 version=version, 1622 level=level, 1623 status_message=status_message, 1624 completion_start_time=completion_start_time, 1625 model=model, 1626 model_parameters=model_parameters, 1627 usage_details=usage_details, 1628 cost_details=cost_details, 1629 prompt=prompt, 1630 )
Initialize a new LangfuseGeneration span.
Arguments:
- otel_span: The OpenTelemetry span to wrap
- langfuse_client: Reference to the parent Langfuse client
- input: Input data for the generation (e.g., prompts)
- output: Output from the generation (e.g., completions)
- metadata: Additional metadata to associate with the generation
- environment: The tracing environment
- version: Version identifier for the model or component
- level: Importance level of the generation (info, warning, error)
- status_message: Optional status message for the generation
- completion_start_time: When the model started generating the response
- model: Name/identifier of the AI model used (e.g., "gpt-4")
- model_parameters: Parameters used for the model (e.g., temperature, max_tokens)
- usage_details: Token usage information (e.g., prompt_tokens, completion_tokens)
- cost_details: Cost information for the model call
- prompt: Associated prompt template from Langfuse prompt management
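LangfuseGeneration instances are normally obtained from the span helpers rather than constructed directly. A minimal sketch using the observation API with `as_type="generation"` (operation names and parameters are illustrative):

```python
with langfuse.start_as_current_span(name="qa-pipeline") as span:
    # Yields a LangfuseGeneration configured with the model details below
    with span.start_as_current_observation(
        as_type="generation",
        name="summarize-article",
        model="gpt-4",
        model_parameters={"temperature": 0.2},
    ) as generation:
        generation.update(output="A short summary...")
```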
1633class LangfuseEvent(LangfuseObservationWrapper): 1634 """Specialized span implementation for Langfuse Events.""" 1635 1636 def __init__( 1637 self, 1638 *, 1639 otel_span: otel_trace_api.Span, 1640 langfuse_client: "Langfuse", 1641 input: Optional[Any] = None, 1642 output: Optional[Any] = None, 1643 metadata: Optional[Any] = None, 1644 environment: Optional[str] = None, 1645 version: Optional[str] = None, 1646 level: Optional[SpanLevel] = None, 1647 status_message: Optional[str] = None, 1648 ): 1649 """Initialize a new LangfuseEvent span. 1650 1651 Args: 1652 otel_span: The OpenTelemetry span to wrap 1653 langfuse_client: Reference to the parent Langfuse client 1654 input: Input data for the event 1655 output: Output from the event 1656 metadata: Additional metadata to associate with the generation 1657 environment: The tracing environment 1658 version: Version identifier for the model or component 1659 level: Importance level of the generation (info, warning, error) 1660 status_message: Optional status message for the generation 1661 """ 1662 super().__init__( 1663 otel_span=otel_span, 1664 as_type="event", 1665 langfuse_client=langfuse_client, 1666 input=input, 1667 output=output, 1668 metadata=metadata, 1669 environment=environment, 1670 version=version, 1671 level=level, 1672 status_message=status_message, 1673 ) 1674 1675 def update( 1676 self, 1677 *, 1678 name: Optional[str] = None, 1679 input: Optional[Any] = None, 1680 output: Optional[Any] = None, 1681 metadata: Optional[Any] = None, 1682 version: Optional[str] = None, 1683 level: Optional[SpanLevel] = None, 1684 status_message: Optional[str] = None, 1685 completion_start_time: Optional[datetime] = None, 1686 model: Optional[str] = None, 1687 model_parameters: Optional[Dict[str, MapValue]] = None, 1688 usage_details: Optional[Dict[str, int]] = None, 1689 cost_details: Optional[Dict[str, float]] = None, 1690 prompt: Optional[PromptClient] = None, 1691 **kwargs: Any, 1692 ) -> "LangfuseEvent": 1693 """Update is not allowed for LangfuseEvent because events cannot be updated. 1694 1695 This method logs a warning and returns self without making changes. 1696 1697 Returns: 1698 self: Returns the unchanged LangfuseEvent instance 1699 """ 1700 langfuse_logger.warning( 1701 "Attempted to update LangfuseEvent observation. Events cannot be updated after creation." 1702 ) 1703 return self
Specialized span implementation for Langfuse Events.
1636 def __init__( 1637 self, 1638 *, 1639 otel_span: otel_trace_api.Span, 1640 langfuse_client: "Langfuse", 1641 input: Optional[Any] = None, 1642 output: Optional[Any] = None, 1643 metadata: Optional[Any] = None, 1644 environment: Optional[str] = None, 1645 version: Optional[str] = None, 1646 level: Optional[SpanLevel] = None, 1647 status_message: Optional[str] = None, 1648 ): 1649 """Initialize a new LangfuseEvent span. 1650 1651 Args: 1652 otel_span: The OpenTelemetry span to wrap 1653 langfuse_client: Reference to the parent Langfuse client 1654 input: Input data for the event 1655 output: Output from the event 1656 metadata: Additional metadata to associate with the generation 1657 environment: The tracing environment 1658 version: Version identifier for the model or component 1659 level: Importance level of the generation (info, warning, error) 1660 status_message: Optional status message for the generation 1661 """ 1662 super().__init__( 1663 otel_span=otel_span, 1664 as_type="event", 1665 langfuse_client=langfuse_client, 1666 input=input, 1667 output=output, 1668 metadata=metadata, 1669 environment=environment, 1670 version=version, 1671 level=level, 1672 status_message=status_message, 1673 )
Initialize a new LangfuseEvent span.
Arguments:
- otel_span: The OpenTelemetry span to wrap
- langfuse_client: Reference to the parent Langfuse client
- input: Input data for the event
- output: Output from the event
- metadata: Additional metadata to associate with the event
- environment: The tracing environment
- version: Version identifier for the model or component
- level: Importance level of the event (info, warning, error)
- status_message: Optional status message for the event
1675 def update( 1676 self, 1677 *, 1678 name: Optional[str] = None, 1679 input: Optional[Any] = None, 1680 output: Optional[Any] = None, 1681 metadata: Optional[Any] = None, 1682 version: Optional[str] = None, 1683 level: Optional[SpanLevel] = None, 1684 status_message: Optional[str] = None, 1685 completion_start_time: Optional[datetime] = None, 1686 model: Optional[str] = None, 1687 model_parameters: Optional[Dict[str, MapValue]] = None, 1688 usage_details: Optional[Dict[str, int]] = None, 1689 cost_details: Optional[Dict[str, float]] = None, 1690 prompt: Optional[PromptClient] = None, 1691 **kwargs: Any, 1692 ) -> "LangfuseEvent": 1693 """Update is not allowed for LangfuseEvent because events cannot be updated. 1694 1695 This method logs a warning and returns self without making changes. 1696 1697 Returns: 1698 self: Returns the unchanged LangfuseEvent instance 1699 """ 1700 langfuse_logger.warning( 1701 "Attempted to update LangfuseEvent observation. Events cannot be updated after creation." 1702 ) 1703 return self
Update is not allowed for LangfuseEvent because events cannot be updated.
This method logs a warning and returns self without making changes.
Returns:
self: Returns the unchanged LangfuseEvent instance
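For illustration, attempting an update is a no-op that only emits the warning shown above (the event name is hypothetical):

```python
event = langfuse.create_event(name="user-signup")

# Logs "Attempted to update LangfuseEvent observation. Events cannot be
# updated after creation." and returns the instance unchanged.
event = event.update(output={"ignored": True})
```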
27class LangfuseOtelSpanAttributes: 28 # Langfuse-Trace attributes 29 TRACE_NAME = "langfuse.trace.name" 30 TRACE_USER_ID = "user.id" 31 TRACE_SESSION_ID = "session.id" 32 TRACE_TAGS = "langfuse.trace.tags" 33 TRACE_PUBLIC = "langfuse.trace.public" 34 TRACE_METADATA = "langfuse.trace.metadata" 35 TRACE_INPUT = "langfuse.trace.input" 36 TRACE_OUTPUT = "langfuse.trace.output" 37 38 # Langfuse-observation attributes 39 OBSERVATION_TYPE = "langfuse.observation.type" 40 OBSERVATION_METADATA = "langfuse.observation.metadata" 41 OBSERVATION_LEVEL = "langfuse.observation.level" 42 OBSERVATION_STATUS_MESSAGE = "langfuse.observation.status_message" 43 OBSERVATION_INPUT = "langfuse.observation.input" 44 OBSERVATION_OUTPUT = "langfuse.observation.output" 45 46 # Langfuse-observation of type Generation attributes 47 OBSERVATION_COMPLETION_START_TIME = "langfuse.observation.completion_start_time" 48 OBSERVATION_MODEL = "langfuse.observation.model.name" 49 OBSERVATION_MODEL_PARAMETERS = "langfuse.observation.model.parameters" 50 OBSERVATION_USAGE_DETAILS = "langfuse.observation.usage_details" 51 OBSERVATION_COST_DETAILS = "langfuse.observation.cost_details" 52 OBSERVATION_PROMPT_NAME = "langfuse.observation.prompt.name" 53 OBSERVATION_PROMPT_VERSION = "langfuse.observation.prompt.version" 54 55 # General 56 ENVIRONMENT = "langfuse.environment" 57 RELEASE = "langfuse.release" 58 VERSION = "langfuse.version" 59 60 # Internal 61 AS_ROOT = "langfuse.internal.as_root" 62 63 # Experiments 64 EXPERIMENT_ID = "langfuse.experiment.id" 65 EXPERIMENT_NAME = "langfuse.experiment.name" 66 EXPERIMENT_DESCRIPTION = "langfuse.experiment.description" 67 EXPERIMENT_METADATA = "langfuse.experiment.metadata" 68 EXPERIMENT_DATASET_ID = "langfuse.experiment.dataset.id" 69 EXPERIMENT_ITEM_ID = "langfuse.experiment.item.id" 70 EXPERIMENT_ITEM_EXPECTED_OUTPUT = "langfuse.experiment.item.expected_output" 71 EXPERIMENT_ITEM_METADATA = "langfuse.experiment.item.metadata" 72 EXPERIMENT_ITEM_ROOT_OBSERVATION_ID = "langfuse.experiment.item.root_observation_id"
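These constants are plain OpenTelemetry span attribute keys, so they can also be used to enrich spans created outside the Langfuse helpers. A minimal sketch, assuming a standard OTel tracer whose spans are exported to Langfuse; the tracer name and the JSON string encoding for structured values are assumptions of this sketch, not documented behavior:

```python
import json

from opentelemetry import trace

from langfuse import LangfuseOtelSpanAttributes

tracer = trace.get_tracer("checkout-service")  # hypothetical tracer name

with tracer.start_as_current_span("checkout") as span:
    # Attribute keys come from the constants defined above
    span.set_attribute(LangfuseOtelSpanAttributes.TRACE_SESSION_ID, "session-123")
    span.set_attribute(
        LangfuseOtelSpanAttributes.OBSERVATION_METADATA,
        json.dumps({"cart_size": 3}),  # JSON encoding is an assumption
    )
```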
1706class LangfuseAgent(LangfuseObservationWrapper): 1707 """Agent observation for reasoning blocks that act on tools using LLM guidance.""" 1708 1709 def __init__(self, **kwargs: Any) -> None: 1710 """Initialize a new LangfuseAgent span.""" 1711 kwargs["as_type"] = "agent" 1712 super().__init__(**kwargs)
Agent observation for reasoning blocks that act on tools using LLM guidance.
1715class LangfuseTool(LangfuseObservationWrapper): 1716 """Tool observation representing external tool calls, e.g., calling a weather API.""" 1717 1718 def __init__(self, **kwargs: Any) -> None: 1719 """Initialize a new LangfuseTool span.""" 1720 kwargs["as_type"] = "tool" 1721 super().__init__(**kwargs)
Tool observation representing external tool calls, e.g., calling a weather API.
1724class LangfuseChain(LangfuseObservationWrapper): 1725 """Chain observation for connecting LLM application steps, e.g. passing context from retriever to LLM.""" 1726 1727 def __init__(self, **kwargs: Any) -> None: 1728 """Initialize a new LangfuseChain span.""" 1729 kwargs["as_type"] = "chain" 1730 super().__init__(**kwargs)
Chain observation for connecting LLM application steps, e.g. passing context from retriever to LLM.
1742class LangfuseEmbedding(LangfuseObservationWrapper): 1743 """Embedding observation for LLM embedding calls, typically used before retrieval.""" 1744 1745 def __init__(self, **kwargs: Any) -> None: 1746 """Initialize a new LangfuseEmbedding span.""" 1747 kwargs["as_type"] = "embedding" 1748 super().__init__(**kwargs)
Embedding observation for LLM embedding calls, typically used before retrieval.
1751class LangfuseEvaluator(LangfuseObservationWrapper): 1752 """Evaluator observation for assessing relevance, correctness, or helpfulness of LLM outputs.""" 1753 1754 def __init__(self, **kwargs: Any) -> None: 1755 """Initialize a new LangfuseEvaluator span.""" 1756 kwargs["as_type"] = "evaluator" 1757 super().__init__(**kwargs)
Evaluator observation for assessing relevance, correctness, or helpfulness of LLM outputs.
1733class LangfuseRetriever(LangfuseObservationWrapper): 1734 """Retriever observation for data retrieval steps, e.g. vector store or database queries.""" 1735 1736 def __init__(self, **kwargs: Any) -> None: 1737 """Initialize a new LangfuseRetriever span.""" 1738 kwargs["as_type"] = "retriever" 1739 super().__init__(**kwargs)
Retriever observation for data retrieval steps, e.g. vector store or database queries.
1760class LangfuseGuardrail(LangfuseObservationWrapper): 1761 """Guardrail observation for protection e.g. against jailbreaks or offensive content.""" 1762 1763 def __init__(self, **kwargs: Any) -> None: 1764 """Initialize a new LangfuseGuardrail span.""" 1765 kwargs["as_type"] = "guardrail" 1766 super().__init__(**kwargs)
Guardrail observation for protection e.g. against jailbreaks or offensive content.
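These typed wrappers are what the observation helpers return when you pass the matching `as_type` (each constructor above pins `as_type` accordingly). A short sketch nesting a tool call and a retrieval step under an agent; the step names are hypothetical:

```python
with langfuse.start_as_current_span(name="trip-planner") as run:
    # Returns a LangfuseAgent
    with run.start_as_current_observation(as_type="agent", name="planner") as agent:
        # Returns a LangfuseTool
        with agent.start_as_current_observation(as_type="tool", name="weather-api") as tool:
            tool.update(output={"temp_c": 21})

        # Returns a LangfuseRetriever
        with agent.start_as_current_observation(as_type="retriever", name="hotel-search") as retriever:
            retriever.update(output={"hits": 5})
```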
97class Evaluation: 98 """Represents an evaluation result for an experiment item or an entire experiment run. 99 100 This class provides a strongly-typed way to create evaluation results in evaluator functions. 101 Users must use keyword arguments when instantiating this class. 102 103 Attributes: 104 name: Unique identifier for the evaluation metric. Should be descriptive 105 and consistent across runs (e.g., "accuracy", "bleu_score", "toxicity"). 106 Used for aggregation and comparison across experiment runs. 107 value: The evaluation score or result. Can be: 108 - Numeric (int/float): For quantitative metrics like accuracy (0.85), BLEU (0.42) 109 - String: For categorical results like "positive", "negative", "neutral" 110 - Boolean: For binary assessments like "passes_safety_check" 111 - None: When evaluation cannot be computed (missing data, API errors, etc.) 112 comment: Optional human-readable explanation of the evaluation result. 113 Useful for providing context, explaining scoring rationale, or noting 114 special conditions. Displayed in Langfuse UI for interpretability. 115 metadata: Optional structured metadata about the evaluation process. 116 Can include confidence scores, intermediate calculations, model versions, 117 or any other relevant technical details. 118 data_type: Optional score data type. Required if value is not NUMERIC. 119 One of NUMERIC, CATEGORICAL, or BOOLEAN. Defaults to NUMERIC. 120 config_id: Optional Langfuse score config ID. 121 122 Examples: 123 Basic accuracy evaluation: 124 ```python 125 from langfuse import Evaluation 126 127 def accuracy_evaluator(*, input, output, expected_output=None, **kwargs): 128 if not expected_output: 129 return Evaluation(name="accuracy", value=None, comment="No expected output") 130 131 is_correct = output.strip().lower() == expected_output.strip().lower() 132 return Evaluation( 133 name="accuracy", 134 value=1.0 if is_correct else 0.0, 135 comment="Correct answer" if is_correct else "Incorrect answer" 136 ) 137 ``` 138 139 Multi-metric evaluator: 140 ```python 141 def comprehensive_evaluator(*, input, output, expected_output=None, **kwargs): 142 return [ 143 Evaluation(name="length", value=len(output), comment=f"Output length: {len(output)} chars"), 144 Evaluation(name="has_greeting", value="hello" in output.lower(), comment="Contains greeting"), 145 Evaluation( 146 name="quality", 147 value=0.85, 148 comment="High quality response", 149 metadata={"confidence": 0.92, "model": "gpt-4"} 150 ) 151 ] 152 ``` 153 154 Categorical evaluation: 155 ```python 156 def sentiment_evaluator(*, input, output, **kwargs): 157 sentiment = analyze_sentiment(output) # Returns "positive", "negative", or "neutral" 158 return Evaluation( 159 name="sentiment", 160 value=sentiment, 161 comment=f"Response expresses {sentiment} sentiment", 162 data_type="CATEGORICAL" 163 ) 164 ``` 165 166 Failed evaluation with error handling: 167 ```python 168 def external_api_evaluator(*, input, output, **kwargs): 169 try: 170 score = external_api.evaluate(output) 171 return Evaluation(name="external_score", value=score) 172 except Exception as e: 173 return Evaluation( 174 name="external_score", 175 value=None, 176 comment=f"API unavailable: {e}", 177 metadata={"error": str(e), "retry_count": 3} 178 ) 179 ``` 180 181 Note: 182 All arguments must be passed as keywords. Positional arguments are not allowed 183 to ensure code clarity and prevent errors from argument reordering. 
184 """ 185 186 def __init__( 187 self, 188 *, 189 name: str, 190 value: Union[int, float, str, bool, None], 191 comment: Optional[str] = None, 192 metadata: Optional[Dict[str, Any]] = None, 193 data_type: Optional[ScoreDataType] = None, 194 config_id: Optional[str] = None, 195 ): 196 """Initialize an Evaluation with the provided data. 197 198 Args: 199 name: Unique identifier for the evaluation metric. 200 value: The evaluation score or result. 201 comment: Optional human-readable explanation of the result. 202 metadata: Optional structured metadata about the evaluation process. 203 data_type: Optional score data type (NUMERIC, CATEGORICAL, or BOOLEAN). 204 config_id: Optional Langfuse score config ID. 205 206 Note: 207 All arguments must be provided as keywords. Positional arguments will raise a TypeError. 208 """ 209 self.name = name 210 self.value = value 211 self.comment = comment 212 self.metadata = metadata 213 self.data_type = data_type 214 self.config_id = config_id
Represents an evaluation result for an experiment item or an entire experiment run.
This class provides a strongly-typed way to create evaluation results in evaluator functions. Users must use keyword arguments when instantiating this class.
Attributes:
- name: Unique identifier for the evaluation metric. Should be descriptive and consistent across runs (e.g., "accuracy", "bleu_score", "toxicity"). Used for aggregation and comparison across experiment runs.
- value: The evaluation score or result. Can be:
- Numeric (int/float): For quantitative metrics like accuracy (0.85), BLEU (0.42)
- String: For categorical results like "positive", "negative", "neutral"
- Boolean: For binary assessments like "passes_safety_check"
- None: When evaluation cannot be computed (missing data, API errors, etc.)
- comment: Optional human-readable explanation of the evaluation result. Useful for providing context, explaining scoring rationale, or noting special conditions. Displayed in Langfuse UI for interpretability.
- metadata: Optional structured metadata about the evaluation process. Can include confidence scores, intermediate calculations, model versions, or any other relevant technical details.
- data_type: Optional score data type. Required if value is not NUMERIC. One of NUMERIC, CATEGORICAL, or BOOLEAN. Defaults to NUMERIC.
- config_id: Optional Langfuse score config ID.
Examples:
Basic accuracy evaluation:
from langfuse import Evaluation

def accuracy_evaluator(*, input, output, expected_output=None, **kwargs):
    if not expected_output:
        return Evaluation(name="accuracy", value=None, comment="No expected output")

    is_correct = output.strip().lower() == expected_output.strip().lower()
    return Evaluation(
        name="accuracy",
        value=1.0 if is_correct else 0.0,
        comment="Correct answer" if is_correct else "Incorrect answer"
    )

Multi-metric evaluator:
def comprehensive_evaluator(*, input, output, expected_output=None, **kwargs):
    return [
        Evaluation(name="length", value=len(output), comment=f"Output length: {len(output)} chars"),
        Evaluation(name="has_greeting", value="hello" in output.lower(), comment="Contains greeting"),
        Evaluation(
            name="quality",
            value=0.85,
            comment="High quality response",
            metadata={"confidence": 0.92, "model": "gpt-4"}
        )
    ]

Categorical evaluation:
def sentiment_evaluator(*, input, output, **kwargs):
    sentiment = analyze_sentiment(output)  # Returns "positive", "negative", or "neutral"
    return Evaluation(
        name="sentiment",
        value=sentiment,
        comment=f"Response expresses {sentiment} sentiment",
        data_type="CATEGORICAL"
    )

Failed evaluation with error handling:
def external_api_evaluator(*, input, output, **kwargs):
    try:
        score = external_api.evaluate(output)
        return Evaluation(name="external_score", value=score)
    except Exception as e:
        return Evaluation(
            name="external_score",
            value=None,
            comment=f"API unavailable: {e}",
            metadata={"error": str(e), "retry_count": 3}
        )
Note:
All arguments must be passed as keywords. Positional arguments are not allowed to ensure code clarity and prevent errors from argument reordering.
186 def __init__( 187 self, 188 *, 189 name: str, 190 value: Union[int, float, str, bool, None], 191 comment: Optional[str] = None, 192 metadata: Optional[Dict[str, Any]] = None, 193 data_type: Optional[ScoreDataType] = None, 194 config_id: Optional[str] = None, 195 ): 196 """Initialize an Evaluation with the provided data. 197 198 Args: 199 name: Unique identifier for the evaluation metric. 200 value: The evaluation score or result. 201 comment: Optional human-readable explanation of the result. 202 metadata: Optional structured metadata about the evaluation process. 203 data_type: Optional score data type (NUMERIC, CATEGORICAL, or BOOLEAN). 204 config_id: Optional Langfuse score config ID. 205 206 Note: 207 All arguments must be provided as keywords. Positional arguments will raise a TypeError. 208 """ 209 self.name = name 210 self.value = value 211 self.comment = comment 212 self.metadata = metadata 213 self.data_type = data_type 214 self.config_id = config_id
Initialize an Evaluation with the provided data.
Arguments:
- name: Unique identifier for the evaluation metric.
- value: The evaluation score or result.
- comment: Optional human-readable explanation of the result.
- metadata: Optional structured metadata about the evaluation process.
- data_type: Optional score data type (NUMERIC, CATEGORICAL, or BOOLEAN).
- config_id: Optional Langfuse score config ID.
Note:
All arguments must be provided as keywords. Positional arguments will raise a TypeError.
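As the note says, the signature is keyword-only (the bare `*` in `__init__` above), so positional construction fails. A quick illustration:

```python
from langfuse import Evaluation

score = Evaluation(name="accuracy", value=0.93, comment="93% exact match")

try:
    Evaluation("accuracy", 0.93)  # positional arguments are rejected
except TypeError as err:
    print(err)  # e.g. "__init__() takes 1 positional argument but 3 were given"
```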