SceneXplain
SceneXplain is an image captioning service accessible through the SceneXplain Tool.
To use this tool, create an account and get your API token from the website. Set the token in the SCENEX_API_KEY environment variable, then instantiate the tool.
import os

# Set your SceneXplain API token (empty here as a placeholder)
os.environ["SCENEX_API_KEY"] = ""

from langchain_community.tools import SceneXplainTool

tool = SceneXplainTool()
# Pass the URL of the image to describe
tool.run(
    "https://storage.googleapis.com/causal-diffusion.appspot.com/imagePrompts%2F0rw369i5h9t%2Foriginal.png"
)
"The image portrays a serene, rain-soaked scene from the animated film *My Neighbor Totoro*. A young girl, Satsuki, stands to the left of a massive gray creature named Totoro. Satsuki is holding a pink umbrella and dressed in bright clothing, while Totoro holds a large dark-colored umbrella. They are sharing this quiet moment together in the rain, which cascades gently around them, creating a tranquil and heartwarming atmosphere. In the background, there is a sidewalk and a signpost, hinting at the iconic setting of the bus stop from the film. Totoro's large, rounded form, tall ears, and calm demeanor make him an endearing figure beside the smaller Satsuki. Their companionship captures the magic and emotional depth of the film."
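The example above leaves the token as an empty string in the source. A minimal sketch of one alternative is shown here, assuming you prefer to read the token from the environment and prompt for it only when it is missing. The helper name `ensure_env_var` is hypothetical and is not part of LangChain or SceneXplain; only `SCENEX_API_KEY` comes from the text above.

```python
import os
from getpass import getpass
from typing import Callable


def ensure_env_var(name: str, prompt: Callable[[str], str] = getpass) -> str:
    """Return the value of `name`, prompting once if it is unset or empty.

    Hypothetical helper for illustration; `prompt` is injectable so the
    function can be exercised without an interactive terminal.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        value = prompt(f"Enter {name}: ").strip()
        if not value:
            raise ValueError(f"{name} must not be empty")
        # Store the value so later code that reads the environment sees it
        os.environ[name] = value
    return value


# Usage (interactive), before instantiating the tool:
# ensure_env_var("SCENEX_API_KEY")
```

Calling the helper before `SceneXplainTool()` keeps the token out of the script itself while leaving the rest of the example unchanged.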
Usage in an Agent
The tool can be used in any LangChain agent as follows:
import os
os.environ["OPENAI_API_KEY"] = ""
os.environ["SCENEX_API_KEY"] = ""
from langchain_community.agent_toolkits.load_tools import load_tools
tools = load_tools(["sceneXplain"])
from langgraph.prebuilt import create_react_agent
agent = create_react_agent("openai:gpt-4.1-mini", tools)
input_message = {
    "role": "user",
    "content": (
        # Trailing space keeps the URL separate from the follow-up question
        "What is in this image https://storage.googleapis.com/causal-diffusion.appspot.com/imagePrompts%2F0rw369i5h9t%2Foriginal.png "
        "Is it a movie or a game? If it is a movie, what is the name of the movie? Answer the question based on the image"
    ),
}
# Stream the agent's state and print the newest message at each step
for step in agent.stream(
    {"messages": [input_message]},
    stream_mode="values",
):
    step["messages"][-1].pretty_print()
================================ Human Message =================================
What is in this image https://storage.googleapis.com/causal-diffusion.appspot.com/imagePrompts%2F0rw369i5h9t%2Foriginal.png Is it a movie or a game? If it is a movie, what is the name of the movie? Answer the question based on the image
================================== Ai Message ==================================
Tool Calls:
  image_explainer (call_eg4cvhJAiUHrZPrWg5G7ScBS)
 Call ID: call_eg4cvhJAiUHrZPrWg5G7ScBS
  Args:
    query: https://storage.googleapis.com/causal-diffusion.appspot.com/imagePrompts%2F0rw369i5h9t%2Foriginal.png
================================= Tool Message =================================
Name: image_explainer
The image depicts a scene from the animated film "My Neighbor Totoro." Totoro, a large grey-brown creature resembling a cat-bear, is standing in the rain beside a young girl. Both characters are holding umbrellas: Totoro has a black one, and the girl holds a pink one. They are positioned near a bus stop on a rainy day, with lush green woods and a dark forest visible in the background. The scene conveys an emotional and warm atmosphere, highlighting the bond between the characters.
================================== Ai Message ==================================
The image is from the animated movie "My Neighbor Totoro." It shows Totoro, a large grey-brown creature, standing in the rain beside a young girl who is holding an umbrella at a bus stop. This is not a game, but a scene from this well-known animated film.
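Python joins adjacent string literals with no separator, so a URL placed at the end of one literal runs straight into the text of the next unless whitespace is added. A minimal sketch of a helper that builds the user message with an explicit separator follows. The name `build_image_question` is hypothetical and not part of LangChain; the message shape (`role` and `content` keys) mirrors the example above.

```python
def build_image_question(image_url: str, question: str) -> dict:
    """Build a user message with the image URL on its own line.

    Hypothetical helper for illustration: the newline after the URL keeps it
    from fusing with the question text that follows.
    """
    url = image_url.strip()
    return {
        "role": "user",
        "content": f"What is in this image {url}\n{question.strip()}",
    }


# Usage with the agent from the example above:
# input_message = build_image_question(
#     "https://storage.googleapis.com/causal-diffusion.appspot.com/imagePrompts%2F0rw369i5h9t%2Foriginal.png",
#     "Is it a movie or a game? If it is a movie, what is the name of the movie?",
# )
```

Keeping the URL on its own line is a design choice rather than a requirement of the tool; a single space would separate the two parts equally well.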
Related
- Tool conceptual guide
- Tool how-to guides