I Had GPT4 Develop a Monitoring Function for the OriginBot

This blog was first published on Guyuehome. If you wish to reprint it, please indicate the source: Guyuehome http://www.guyuehome.com/45164

As I mentioned in my previous blog, I use ChatGPT (GPT4-32k) extensively in my work for assistance, such as solution design, function writing, code review, writing test cases, and debugging. Although it isn't always completely correct, it is still very useful as an auxiliary and reference tool.

When developing the monitoring function for the OriginBot this time, I wanted to try using GPT4 to develop this function from start to finish. I had never asked GPT4 to complete a full-fledged task before, because it is limited by context length and struggles with large-scale tasks. This was an experiment, but judging from the results, it was quite successful.

I used the GPT4-32k model, which is much more powerful than GPT3.5. I think this is an important reason for the good results.

To make it easier for you to understand the following content, I will explain the structure of the article here.
The content after “Message From You” is my Prompt, that is, the question. The content after “Message From ChatGPT” is ChatGPT’s answer.
I will also add my own explanations in the form of quotations in some places; these are not part of my conversation with ChatGPT.
So, if you want a brief understanding of this blog, you can just read the quoted parts.

What follows is the main text; it is extremely long, at 19,555 characters.


Message From You:

Now you are an expert in robot development, familiar with various robot-related technologies such as ros2, SLAM, nav2, motion control and planning. You are also familiar with embedded development like stm32, and various machine-learning skills. I will assign you some tasks. Please think carefully and complete these tasks.

Message From ChatGPT:

Sure, I’d be happy to help you solve problems. Please clearly tell me the tasks you need to complete and the problems you encounter, and I will do my best to provide the best solutions or suggestions.

In this conversation, I assigned ChatGPT the role of a robot development expert to activate domain knowledge. From my experience, doing this can improve the effectiveness of GPT’s answers to some extent.

Message From You:

The OriginBot is a two-wheeled robot equipped with a MIPI camera, LiDAR, and IMU. The code is developed based on ros2 foxy. Now, I can tell you that after the OriginBot's camera is enabled, it publishes image data on a topic called image_raw. On another server under the same WiFi, I want to use VUE3 + Django + DRF to display continuous video in real time, serving as a remote monitoring function. Please think carefully, break this task down, and describe the steps in detail.

Message From ChatGPT:

To complete this task, we can roughly break it down into the following steps:

  1. Receive Camera Data from the ROS 2 Node

    • You need to create a new ROS 2 node to subscribe to the image_raw topic. When image data is published, this node can receive it.
    • This node may need to use some ROS and computer-vision libraries (such as cv_bridge and OpenCV) to convert the ROS Image message into a format suitable for network transmission (such as JPEG or PNG).
  2. Establish Communication between the Server and the ROS Node

    • After the ROS node successfully acquires and processes the image data, the next step is to set up a listening service on another server in the same WiFi environment. You can use WebSocket to achieve this function because it supports real-time, full-duplex, and point-to-point connections.
    • In your ROS node, you need to add WebSocket client code to send the processed image data to the server.
  3. Set up the Django + DRF Server

    • On the server, create a Django project and use the Django REST framework (DRF) to build an API interface.
    • Start a WebSocket server. In the Django project, use a third-party library such as channels to create a WebSocket route to receive the image data sent from the ROS node.
  4. Store and Stream the Video

    • Usually, consecutive frames are organized into a video sequence. So after you have obtained each individual frame, you need to consider how to organize these frames and play them through a web page.
    • You can consider using a streaming protocol such as MPEG-DASH or HLS. I’m not sure if Django has such built-in functions, but there should be corresponding libraries available.
  5. VUE3 Front-end Display

    • Develop the front - end page using Vue.js and subscribe to the image stream captured by the camera through WebSocket.
    • Display the received images as a video. The HTML5 video player or a third-party library such as video.js can be used to meet this requirement.

The above is the overall process. In the actual implementation, there may be some flexible adjustments and detailed designs, such as error handling and thread management.

This is the framework and steps of GPT’s overall solution, with a brief introduction to each step. The solution it provided might work, but from my limited experience in robot development, it is not very good, so I modified this framework later.

It can be seen that we can’t completely rely on ChatGPT’s answers and must make our own judgments.

Message From You:

Now please give a detailed description, steps, commands, and code for the first step, “Receive Camera Data from the ROS 2 Node”. Remember to consider the relevance to all other steps at the same time.

Message From ChatGPT:

The specific steps to receive camera data from the ROS 2 node are as follows:

  1. Environment Preparation
    Ensure that you have installed ros2 foxy and other necessary dependent libraries, such as cv_bridge and OpenCV.

  2. Create a ROS 2 Package
    If you haven’t created one yet, you need to create a new ROS 2 package first. Suppose we name it camera_listener. The following are the relevant commands to create this package in the terminal:

    source /opt/ros/foxy/setup.bash  # Configure ros2 environment variables
    mkdir -p ~/robot_ws/src  # Create the workspace and src directory. If it already exists, you can skip this step.
    cd ~/robot_ws/src
    ros2 pkg create --build-type ament_python camera_listener  # Create a Python-type ros2 package
  3. Set up a ROS 2 Node and Subscribe to the Topic

    In the camera_listener package, create a Python file (for example, listener.py) to implement the ROS node for image acquisition:

import rclpy
from rclpy.node import Node
from sensor_msgs.msg import Image
from cv_bridge import CvBridge


class CameraListener(Node):

    def __init__(self):
        super().__init__('camera_listener')
        self.bridge = CvBridge()
        self.subscription = self.create_subscription(
            Image,
            'image_raw',
            self.listener_callback,
            10)
        self.subscription  # prevent unused variable warning

    def listener_callback(self, msg):
        cv_image = self.bridge.imgmsg_to_cv2(msg, desired_encoding='passthrough')
        # Process the cv_image


def main(args=None):
    rclpy.init(args=args)

    camera_listener = CameraListener()

    rclpy.spin(camera_listener)

    # Destroy the node explicitly
    camera_listener.destroy_node()
    rclpy.shutdown()


if __name__ == '__main__':
    main()
This code first imports the necessary modules, then defines a `CameraListener` class that inherits from `Node` and subscribes to the `image_raw` topic. Whenever a message is published to this topic, the `listener_callback` method is triggered.
Note that the `imgmsg_to_cv2` function here converts the ROS Image message to the OpenCV format.
  4. Run the Node

    Then you can run the newly created node in the following way:

    cd ~/robot_ws/
    colcon build --packages-select camera_listener  # Compile the workspace
    source install/setup.bash  # Set the environment variables
    ros2 run camera_listener listener.py  # Run the node

The above are the steps to receive camera data. We need to ensure that when the OriginBot publishes image data, the newly created node runs normally and receives this data.
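
As a quick sanity check before adding any networking code, the standard ros2 CLI tools can confirm that frames are actually arriving (assuming the OriginBot’s camera node is already running):

ros2 topic list  # image_raw should appear in the output
ros2 topic hz /image_raw  # prints the publishing rate if frames are flowing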

In this code, we also need to consider subsequent interactions with other systems. The listener_callback function obtains the Image message from ROS and converts it to the cv2 image format. Inside this function, you can add operations such as sending the image to the server (using a Python WebSocket client library) or persisting it.
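
For illustration, a minimal sketch of that WebSocket option might look like the following, assuming the websocket-client package (pip install websocket-client) and a placeholder server endpoint; the MQTT approach adopted below replaces this:

import cv2
import websocket  # from the websocket-client package


def send_frame(ws, cv_image):
    # Encode the frame as JPEG and push the raw bytes over the WebSocket
    ret, jpeg = cv2.imencode('.jpg', cv_image)
    if ret:
        ws.send(jpeg.tobytes(), opcode=websocket.ABNF.OPCODE_BINARY)


# The endpoint URL is hypothetical; it would be served by the Django channels route from step 3
ws = websocket.create_connection('ws://192.168.1.100:8000/ws/camera/')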

Message From You:

Please improve the listener_callback function. Use the paho module in it to publish data to the MQTT server. Ensure that after I run the node, data can be continuously sent to MQTT.

Message From ChatGPT:

First, you need to install the Paho MQTT library. Run the following command in your Python environment:

pip install paho-mqtt

Then, here is an example of using the paho mqtt client to publish image data to the MQTT server:

For ROS, we will convert the obtained image to JPEG format and base64-encode it (because MQTT does not support directly sending binary data), and then send it through MQTT.

Modify the listener.py file:

import cv2
import rclpy
from rclpy.node import Node
from sensor_msgs.msg import Image
from cv_bridge import CvBridge
import numpy as np
import paho.mqtt.client as mqtt
import base64


class CameraListener(Node):

    def __init__(self):
        super().__init__('camera_listener')
        self.bridge = CvBridge()
        self.subscription = self.create_subscription(
            Image,
            'image_raw',
            self.listener_callback,
            10)

        # Initialize the MQTT client
        self.client = mqtt.Client()
        self.client.connect("localhost", 1883, 60)  # If your mqtt server is running on a different machine, change localhost to its IP address.

    def listener_callback(self, msg):
        cv_image = self.bridge.imgmsg_to_cv2(msg, desired_encoding='passthrough')

        # Convert the image to JPEG
        ret, jpeg = cv2.imencode('.jpg', cv_image)

        if not ret:
            raise Exception('Could not encode image!')

        b64_string = base64.b64encode(jpeg.tobytes()).decode('utf-8')

        # Publish the encoded string via MQTT
        self.client.publish("robot/camera/image_raw", b64_string)


def main(args=None):
    rclpy.init(args=args)

    camera_listener = CameraListener()

    rclpy.spin(camera_listener)

    # Destroy the node explicitly
    camera_listener.destroy_node()
    rclpy.shutdown()


if __name__ == '__main__':
    main()

This code takes the Image message obtained from ROS, converts it into a base64-encoded JPEG string, and sends it to the MQTT server. Note that ‘localhost’ and 1883 should be replaced with the actual MQTT server address and port number.

To test this program locally, make sure an MQTT broker such as Mosquitto is installed and running.
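
For example, on Ubuntu the broker and its command-line clients can be installed and verified roughly like this (a sketch; package names may vary by distribution):

sudo apt install mosquitto mosquitto-clients
sudo systemctl start mosquitto  # or run mosquitto directly in a terminal
mosquitto_sub -h localhost -t "robot/camera/image_raw"  # prints base64 frames as they arrive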

To start the node, compile and build the ROS workspace as before, and then use the ros2 run command to execute the node. Enter the following commands in the terminal, and the images captured by the camera will be continuously sent to the MQTT server:

cd ~/robot_ws/
colcon build --packages-select camera_listener
source install/setup.bash
ros2 run camera_listener listener.py
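
To verify the frames on the receiving side, a small subscriber along these lines can decode the base64 payload back into an image (a sketch, assuming paho-mqtt, OpenCV, and numpy are installed on the server):

import base64
import cv2
import numpy as np
import paho.mqtt.client as mqtt


def on_message(client, userdata, msg):
    # Decode the base64 string back into JPEG bytes, then into an OpenCV image
    jpeg_bytes = base64.b64decode(msg.payload)
    frame = cv2.imdecode(np.frombuffer(jpeg_bytes, dtype=np.uint8), cv2.IMREAD_COLOR)
    print('Received frame with shape:', frame.shape)


client = mqtt.Client()
client.on_message = on_message
client.connect('localhost', 1883, 60)  # replace localhost with the broker's address
client.subscribe('robot/camera/image_raw')
client.loop_forever()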

Message From You:

Before compilation, you should also improve the entry_points in setup.py of the camera_listener node.
Currently, it is like this. Please improve it.
entry_points={
    'console_scripts': [
    ],
},

Message From ChatGPT:

In the setup.py file, we need to add the entry_points field so that ROS2 can find and run our Python script. Here is how to set listener.py as an executable command:

entry_points={
    'console_scripts': [
        'camera_listener = camera_listener.listener:main'
    ],
},

This code tells ROS2 to look for the main function in the listener module of the camera_listener package as the entry point for the camera_listener command.

Remember, you also need to ensure that the corresponding dependencies are defined in package.xml and that there are corresponding installation rules in CMakeLists.txt or setup.cfg (depending on whether you use ament_cmake or ament_python). If you are using Python and ament_python, setup.py should be sufficient; the template created by ros2 pkg create has already handled these steps.
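
One detail worth noting: with this entry point, the executable registered with ROS2 is named camera_listener, so after rebuilding, the node is started with that name rather than with the file name used in the earlier ros2 run commands:

cd ~/robot_ws/
colcon build --packages-select camera_listener
source install/setup.bash
ros2 run camera_listener camera_listener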

In the above conversations, I actually asked ChatGPT to develop a ros2 package to subscribe to image data and send it to the MQTT Server.

If you look closely at the conversation content, you will find that ChatGPT’s answers were initially incorrect. It was only after my continuous adjustments that a relatively good result was obtained.

For example, the listener_callback function it gave the first time was just a shell with no actual content; I had to specify that it should use paho-mqtt to handle the MQTT-related functions; and finally, I reminded it to improve the relevant configuration in the package’s setup.py.

Message From You:

To complete this task, we can roughly break it down into the following steps:

  1. Receive Camera Data from the ROS 2 Node

    • You need to create a new ROS 2 node to subscribe to the image_raw topic. When image data is published, this node can receive it.
    • This node may need to use some ROS and computer-vision libraries (such as cv_bridge and OpenCV) to convert the ROS Image message into a format suitable for network transmission (such as JPEG or PNG).
  2. Establish Communication between the Server and the ROS Node

    • After the ROS node successfully acquires and processes the image data, the next step is to set up a listening service on another server in the same WiFi environment. You can use WebSocket to achieve this function because it supports real-time, full-duplex, and point-to-point connections.
    • In your ROS node, you need to add WebSocket client code to send the processed image data to the server.
  3. Set up the Django + DRF Server

    • On the server, create a Django project and use the Django REST framework (DRF) to build an API interface.
    • Start a WebSocket server. In the Django project, use a third-party library such as channels to create a WebSocket route to receive the image data sent from the ROS node.
  4. Store and Stream the Video

    • Usually, consecutive frames are organized into a video sequence. So after you have obtained each individual frame, you need to consider how to organize these frames and play them through a web page.
    • You can consider using a streaming protocol such as MPEG-DASH or HLS. I’m not sure if Django has such built-in functions, but there should be corresponding libraries available.
  5. VUE3 Front-end Display

    • Develop the front - end page using Vue.js and subscribe to the image stream captured by the camera through WebSocket.
    • Display the received images as a video. The HTML5 video player or a third-party library such as video.js can be used to meet this requirement.

This is the suggestion you gave me. You have completed the first step. Now please complete the second step: Establish communication between the server and the ROS node. Give a detailed description, steps, commands, and code. Remember to consider the relevance to all other steps at the same time.
In addition, in the first step, we adopted the method of sending data out through MQTT. So you should pay attention to this part when designing.

Here I repeated the solution ChatGPT gave me earlier to help it recall. In my experience, doing this can alleviate the forgetting caused by the context length limit.

Message From ChatGPT:

Okay, it seems you have a