I Had GPT-4 Develop a Monitoring Function for the OriginBot
This blog was first published on Guyuehome. If you wish to reprint it, please indicate the source: Guyuehome http://www.guyuehome.com/45164
As I mentioned in my previous blog, I use ChatGPT (GPT-4-32k) extensively in my work for assistance, such as in solution design, function writing, code review, writing test cases, and debugging. Although it is not always completely correct, it is still very useful as an auxiliary and reference tool.
When developing the monitoring function for the OriginBot this time, I wanted to try using GPT-4 to develop this function from start to finish. I had never asked GPT-4 to complete a full-fledged task before, because it is limited by the context length and has difficulty handling large-scale tasks. This was an experiment, but judging from the results, it was quite successful.
I used the GPT-4-32k model, which is much more powerful than GPT-3.5. I think this is an important reason for the good results.
To make it easier for you to understand the following content, I will explain the structure of the article here.
The content after “Message From You” is my Prompt, that is, the question. The content after “Message From ChatGPT” is ChatGPT’s answer.
I will also add my explanations in the form of quotations in some places. These contents are not part of my conversation with ChatGPT.
So, if you want a brief understanding of this blog, you can just read the quoted parts.
The following is the main text; it is extremely long, at 19,555 characters.
Message From You:
Now you are an expert in robot development, familiar with various robot-related technologies such as ros2, SLAM, nav2, and motion control and planning. You are also familiar with embedded development (such as stm32) and various machine-learning skills. I will assign you some tasks. Please think carefully and complete these tasks.
Message From ChatGPT:
Sure, I’d be happy to help you solve problems. Please clearly tell me the tasks you need to complete and the problems you encounter, and I will do my best to provide the best solutions or suggestions.
In this conversation, I assigned ChatGPT the role of a robot-development expert to activate its domain knowledge. In my experience, doing this can improve the quality of GPT's answers to some extent.
Message From You:
The OriginBot is a two-wheeled robot equipped with a MIPI camera, LiDAR, and an IMU. The code is developed on ros2 foxy. Now, I tell you that after the OriginBot's camera is enabled, it publishes image data on a topic called image_raw. On another server under the same WiFi, I want to use VUE3 + Django + DRF to display continuous video in real time, serving as a remote monitoring function. Please think carefully, break down this task, and describe the steps in detail.
Message From ChatGPT:
To complete this task, we can roughly break it down into the following steps:
Receive Camera Data from the ROS 2 Node
- You need to create a new ROS 2 node to subscribe to the `image_raw` topic. When image data is published, this node can receive it.
- This node may need to use some ROS and computer-vision libraries (such as cv_bridge and OpenCV) to convert the ROS Image message into a format suitable for network transmission (such as JPEG or PNG).
Establish Communication between the Server and the ROS Node
- After the ROS node successfully acquires and processes the image data, the next step is to set up a listening service on another server in the same WiFi environment. You can use WebSocket to achieve this because it supports real-time, full-duplex, point-to-point connections.
- In your ROS node, you need to add WebSocket client code to send the processed image data to the server.
Set up the Django + DRF Server
- On the server, create a Django project and use the Django REST framework (DRF) to build an API interface.
- Start a WebSocket server. In the Django project, use a third-party library such as `channels` to create a WebSocket route to receive the image data sent from the ROS node.
Store and Stream the Video
- Usually, consecutive frames are organized into a video sequence, so after you have obtained each individual frame, you need to consider how to organize these frames and play them through a web page.
- You can consider using a streaming protocol such as MPEG-DASH or HLS. I'm not sure whether Django has such functions built in, but there should be corresponding libraries available.
VUE3 Front-end Display
- Develop the front-end page using Vue.js and subscribe, through WebSocket, to the image stream captured by the camera.
- Display the received images as a video. An HTML5 video player or a third-party library such as video.js can be used to meet this requirement.
The above is the overall process. In the actual implementation process, there may be some flexible adjustments and detailed designs, such as error handling and thread management.
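To make the third step more concrete, a minimal Django Channels consumer that accepts WebSocket connections and rebroadcasts incoming frames could look like the sketch below. The class name, the `video` group, and the binary-frame relay protocol are illustrative assumptions, not code from the conversation; it also presumes `channels` is installed and a channel layer is configured in the Django project.

```python
# Sketch of a Channels WebSocket consumer; names and routing are illustrative.
from channels.generic.websocket import AsyncWebsocketConsumer


class VideoStreamConsumer(AsyncWebsocketConsumer):
    async def connect(self):
        # Put every viewer (and the frame producer) in one broadcast group
        await self.channel_layer.group_add('video', self.channel_name)
        await self.accept()

    async def disconnect(self, close_code):
        await self.channel_layer.group_discard('video', self.channel_name)

    async def receive(self, text_data=None, bytes_data=None):
        # A binary frame pushed by the producer is rebroadcast to the group
        if bytes_data:
            await self.channel_layer.group_send(
                'video', {'type': 'video.frame', 'frame': bytes_data})

    async def video_frame(self, event):
        # Handler for 'video.frame' messages: forward the frame to this client
        await self.send(bytes_data=event['frame'])
```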
This is the framework and steps of GPT's overall solution, with a brief introduction to each step. The solution it provided might work, but from my (admittedly limited) experience in robot development, it is not very good, so I made some modifications to this framework later.
It can be seen that we can’t completely rely on ChatGPT’s answers and must make our own judgments.
Message From You:
Now please give a detailed description, with steps, commands, and code, for the first step, "Receive Camera Data from the ROS 2 Node". Remember to consider its relevance to all the other steps at the same time.
Message From ChatGPT:
The specific steps to receive camera data from the ROS 2 node are as follows:
Environment Preparation

Ensure that you have installed ros2 foxy and the other necessary dependency libraries, such as cv_bridge and OpenCV.

Create a ROS 2 Package

If you haven't created one yet, you need to create a new ROS 2 package first. Suppose we name it `camera_listener`. The following are the relevant commands to create this package in the terminal:

```bash
source /opt/ros/foxy/setup.bash  # Configure ros2 environment variables
mkdir -p ~/robot_ws/src          # Create the workspace and src directory; skip this if it already exists
cd ~/robot_ws/src
ros2 pkg create --build-type ament_python camera_listener  # Create a Python-type ros2 package
```

Set up a ROS 2 Node and Subscribe to the Topic
In the `camera_listener` package, create a Python file (for example, `listener.py`) to implement the ROS node for image acquisition:
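A minimal version of `listener.py` consistent with the description below might look like this sketch. The node name, the queue size of 10, and the log line are illustrative assumptions; the subscription to `image_raw` and the `imgmsg_to_cv2` conversion are what the conversation specifies.

```python
import rclpy
from rclpy.node import Node
from sensor_msgs.msg import Image
from cv_bridge import CvBridge


class CameraListener(Node):
    def __init__(self):
        super().__init__('camera_listener')
        self.bridge = CvBridge()
        # Subscribe to the image_raw topic published by the OriginBot camera
        self.subscription = self.create_subscription(
            Image, 'image_raw', self.listener_callback, 10)

    def listener_callback(self, msg):
        # Convert the ROS Image message to an OpenCV (cv2) image
        cv_image = self.bridge.imgmsg_to_cv2(msg, desired_encoding='bgr8')
        # Placeholder: later steps add JPEG encoding and MQTT publishing here
        self.get_logger().info('Received frame: %dx%d' % (msg.width, msg.height))


def main(args=None):
    rclpy.init(args=args)
    node = CameraListener()
    rclpy.spin(node)
    node.destroy_node()
    rclpy.shutdown()


if __name__ == '__main__':
    main()
```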
This code first imports the necessary modules, then defines a `CameraListener` class that inherits from `Node` and subscribes to the `image_raw` topic. Whenever a message is published to this topic, the `listener_callback` method will be triggered.
Note that the `imgmsg_to_cv2` function here converts the ROS image message to the OpenCV format.
Run the Node
Then you can run the newly created node as follows:

```bash
cd ~/robot_ws/
colcon build --packages-select camera_listener  # Compile the workspace
source install/setup.bash                       # Set the environment variables
ros2 run camera_listener listener.py            # Run the node
```
The above are the steps to receive camera data. We need to ensure that when the OriginBot publishes image data, the newly created node can run normally and receive it.
In this code, we also need to consider subsequent interactions with other systems. The `listener_callback` function obtains the Image message from ROS and converts it to the cv2 image format. Inside this function, you can add operations such as sending the image to the server (using a Python WebSocket client library) or persisting it.
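For instance, persisting each frame could be as simple as writing it to disk. The hypothetical helper below illustrates the idea; the output directory and naming scheme are arbitrary choices, not part of the original design.

```python
import os
import time

import cv2


def save_frame(cv_image, out_dir='/tmp/frames'):
    """Persist one OpenCV frame to disk; purely illustrative."""
    os.makedirs(out_dir, exist_ok=True)
    # Timestamped file name so consecutive frames do not overwrite each other
    filename = os.path.join(out_dir, 'frame_%d.jpg' % int(time.time() * 1000))
    cv2.imwrite(filename, cv_image)
```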
Message From You:
Please improve the `listener_callback` function. Use the `paho` module in it to publish data to the MQTT server. Ensure that after I run the node, data can be continuously sent to MQTT.
Message From ChatGPT:
First, you need to install the Paho MQTT library. Run the following command in your Python environment:
```bash
pip install paho-mqtt
```
Then, here is an example of using the `paho-mqtt` client to publish image data to the MQTT server:
For ROS, we will convert the obtained image to JPEG format and encode it (because MQTT does not support directly sending binary data), and then send it through MQTT.
Modify the `listener.py` file:
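The modified node, sketched below, follows that description. The broker address `localhost`, port 1883, the `camera/image` topic name, and the paho-mqtt 1.x client style are assumptions to adapt; the JPEG conversion and base64 string encoding are what the conversation describes.

```python
import base64

import cv2
import paho.mqtt.client as mqtt
import rclpy
from cv_bridge import CvBridge
from rclpy.node import Node
from sensor_msgs.msg import Image


class CameraListener(Node):
    def __init__(self):
        super().__init__('camera_listener')
        self.bridge = CvBridge()
        self.subscription = self.create_subscription(
            Image, 'image_raw', self.listener_callback, 10)
        # Connect to the MQTT broker (paho-mqtt 1.x style client);
        # replace 'localhost' and 1883 with the real address and port
        self.mqtt_client = mqtt.Client()
        self.mqtt_client.connect('localhost', 1883)
        self.mqtt_client.loop_start()

    def listener_callback(self, msg):
        # Convert the ROS Image message to an OpenCV image
        cv_image = self.bridge.imgmsg_to_cv2(msg, desired_encoding='bgr8')
        # Encode the frame as JPEG, then base64-encode it into a plain string
        ok, jpeg = cv2.imencode('.jpg', cv_image)
        if ok:
            payload = base64.b64encode(jpeg.tobytes())
            self.mqtt_client.publish('camera/image', payload)


def main(args=None):
    rclpy.init(args=args)
    node = CameraListener()
    rclpy.spin(node)
    node.destroy_node()
    rclpy.shutdown()


if __name__ == '__main__':
    main()
```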
This code processes the Image message obtained from ROS, converts it into a JPEG string, and sends it to the MQTT server. Note that 'localhost' and 1883 should be replaced with the actual MQTT server address and port number.
To test this program locally, make sure you have installed an MQTT broker such as Mosquitto and that it is running.
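To verify the pipeline end to end, a small subscriber script can print the size of every frame that arrives. The broker address and topic name here match the assumptions in the sketch above.

```python
import paho.mqtt.client as mqtt


def on_message(client, userdata, msg):
    # Each MQTT message payload is one base64-encoded JPEG frame
    print('Frame on %s: %d bytes' % (msg.topic, len(msg.payload)))


client = mqtt.Client()  # paho-mqtt 1.x style constructor
client.on_message = on_message
client.connect('localhost', 1883)  # replace with the actual broker address/port
client.subscribe('camera/image')   # topic name assumed in the sketch above
client.loop_forever()
```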
To start the node, compile and build the ROS workspace as shown before, and then use the `ros2 run` command to execute the node. Enter the following commands in the terminal, and you will see the images captured by the camera being continuously sent to the MQTT server:

```bash
cd ~/robot_ws/
colcon build --packages-select camera_listener
source install/setup.bash
ros2 run camera_listener listener.py
```
Message From You:
Before compiling, you should also improve the `entry_points` in `setup.py` of the `camera_listener` package.
Currently, it is like this. Please improve it:

```python
entry_points={
    'console_scripts': [
    ],
},
```
Message From ChatGPT:
In the `setup.py` file, we need to add the `entry_points` field so that ROS2 can find and run our Python script. Here is how to set `listener.py` up as an executable command:
```python
entry_points={
    'console_scripts': [
        'camera_listener = camera_listener.listener:main',
    ],
},
```
This code tells ROS2 to look for the `main` function in the `listener` module of the `camera_listener` package as the entry point for the `camera_listener` command.
Remember, you also need to ensure that the corresponding dependencies are defined in `package.xml` and that there are corresponding installation rules in `CMakeLists.txt` or `setup.cfg` (depending on whether you use `ament_cmake` or `ament_python`). If you are using Python and `ament_python`, `setup.py` should be sufficient. The templates created by `ros2 pkg create` have already handled all of these steps.
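One detail worth pointing out: once this entry point is defined and the package is rebuilt, the node can be started with the console-script name rather than the file name, i.e. `ros2 run camera_listener camera_listener` instead of the earlier `ros2 run camera_listener listener.py`.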
In the above conversations, I actually asked ChatGPT to develop a ros2 package to subscribe to image data and send it to the MQTT Server.
If you look closely at the conversation content, you will find that ChatGPT’s answers were initially incorrect. It was only after my continuous adjustments that a relatively good result was obtained.
For example, the `listener_callback` function it gave the first time was just a shell with no actual content; I had to specify that it should use `paho-mqtt` to handle the MQTT-related functions; and finally, I had to remind it to improve the relevant configuration in the package's `setup.py`.
Message From You:
To complete this task, we can roughly break it down into the following steps:
Receive Camera Data from the ROS 2 Node
- You need to create a new ROS 2 node to subscribe to the `image_raw` topic. When image data is published, this node can receive it.
- This node may need to use some ROS and computer-vision libraries (such as cv_bridge and OpenCV) to convert the ROS Image message into a format suitable for network transmission (such as JPEG or PNG).
Establish Communication between the Server and the ROS Node
- After the ROS node successfully acquires and processes the image data, the next step is to set up a listening service on another server in the same WiFi environment. You can use WebSocket to achieve this because it supports real-time, full-duplex, point-to-point connections.
- In your ROS node, you need to add WebSocket client code to send the processed image data to the server.
Set up the Django + DRF Server
- On the server, create a Django project and use the Django REST framework (DRF) to build an API interface.
- Start a WebSocket server. In the Django project, use a third-party library such as `channels` to create a WebSocket route to receive the image data sent from the ROS node.
Store and Stream the Video
- Usually, consecutive frames are organized into a video sequence, so after you have obtained each individual frame, you need to consider how to organize these frames and play them through a web page.
- You can consider using a streaming protocol such as MPEG-DASH or HLS. I'm not sure whether Django has such functions built in, but there should be corresponding libraries available.
VUE3 Front-end Display
- Develop the front-end page using Vue.js and subscribe, through WebSocket, to the image stream captured by the camera.
- Display the received images as a video. An HTML5 video player or a third-party library such as video.js can be used to meet this requirement.
This is the suggestion you gave me, and you have completed the first step. Now please complete the second step: establish communication between the server and the ROS node. Give a detailed description, with steps, commands, and code. Remember to consider its relevance to all the other steps at the same time.
In addition, in the first step we adopted the approach of sending the data out through MQTT, so pay attention to this part when designing.
Here I repeated the solution ChatGPT gave me earlier to help it recall. In my experience, doing this can alleviate the forgetting caused by the context-length limit.
Message From ChatGPT:
Okay, it seems you have a