Robotics Technology Stack

Recently, I felt the need to organize my robotics technology stack and identify its gaps. This article also serves as a review of, and commentary on, my experience over the past few years. I have always believed that engineering is a practical discipline. All of us stand on the shoulders of those who came before us to create a more interesting world. Finding useful tools and putting them to good use is what engineering is about, and it is why we can proudly call ourselves "engineers." I have always thought that engineering is a great profession: we change the world by tightly linking science with everyday life.

This article is project-oriented and based on "first principles": the ultimate goal is to produce a result, a product, or some other concrete outcome. Using "backtracking," we start from the result and trace back the steps needed to reach it (these ideas are inspired by the book "Think Like a Rocket Scientist"). This article follows the same approach, starting from the major categories of robotics and gradually working down to individual technical details. Many of the technologies may become outdated, but the "intuition" behind each one is often timeless, and it lets us think laterally and come up with many different solutions. One example is a robot control project I worked on in which the communication rate was too low. There were several possible directions for optimization: the task scheduling inside the embedded system, the communication protocol, or the way the host computer received data. Looking back, the better solution would have been to optimize the scheduling and communication protocol on the embedded device, which would squeeze more performance out of the hardware and benefit future applications. The host side could also be optimized, but for later applications such as real-time control, it is still necessary to find and fix the root cause.

1. Mechanical Design

A complete robotics project, in my view, starts with the mechanical part and then moves on to hardware, software, and so on. Good mechanical design paves the way for the subsequent steps. Since I have mainly worked on small robots, I have not needed mechanics of materials or fluid dynamics so far, but I will catch up if needed in the future.

1.1 CAD

In robot design, CAD software is indispensable for designing and simulating mechanical structures. Recommended CAD software includes SolidWorks and Fusion 360.

SolidWorks

SolidWorks is a 3D CAD package widely used in engineering design. In robot design, SolidWorks can be used for designing mechanical structures, including part modeling, assembly, and motion simulation. SolidWorks is the software I use regularly, and I am quite familiar with the entire design process. Overall, I find it very good.

Fusion 360

Fusion 360 is a cloud-based 3D CAD software with powerful functions, suitable for mechanical design, industrial design, and other fields. Compared to SolidWorks, Fusion 360 is more suitable for individual users, with a lower price and more cloud-based features. I tried switching from SolidWorks to Fusion 360 because it is more user-friendly for individual users, but later I found it difficult to adapt to the change in design flow, so I gave up on the transition.

NX

NX is a high-end 3D CAD package developed by Siemens, known for its powerful functions and high precision. In robot design, NX can be used for designing complex mechanical structures, including part modeling, assembly, and motion simulation. NX is professional software that I have not used myself, but I have seen it used in some high-end projects.

1.2 3D Printing

The mainstream 3D printing methods used in laboratories or at home are FDM (Fused Deposition Modeling) and SLA (Stereolithography). FDM builds a part by melting plastic filament and depositing it layer by layer. SLA is a photopolymerization process that cures photosensitive resin layer by layer with ultraviolet light. FDM printers are more common and cheaper. SLA printers are more expensive but offer higher printing precision, which on LCD-based machines usually depends on the resolution of the LCD panel. However, SLA printers are slower, require special resin, and need post-print cleaning and curing, which makes them more cumbersome to use. FDM printers are therefore more suitable for individual users and small businesses. Since 2017, I have mainly used four kinds of FDM printers: DIY, Ultimaker, Anycubic, and Bambu Lab.

Print materials mainly include PLA, ABS, PETG, and TPU. PLA is the most commonly used: it is cheap, easy to print, and suitable for beginners. ABS is an engineering plastic with high strength and heat resistance, suitable for mechanical structures. PETG is a newer print material with high transparency and heat resistance, suitable for shells and similar parts. TPU is an elastic material with high flexibility and wear resistance, suitable for tires and similar parts. Ultimaker also provides a PVA water-soluble support material.

Ultimaker

Ultimaker is a well-known 3D printer manufacturer, known for the high quality and stable performance of its products. In the field of robotics, Ultimaker's 3D printers can be used to print mechanical structures, shells, and other parts.

In the lab, we use the Ultimaker 2+, Ultimaker 3 Extended, and Ultimaker S5. The Ultimaker 3 and later models have dual nozzles and can print PVA water-soluble support material, but this material is quite temperature-sensitive and often clogs the nozzle or extrudes unevenly, so the experience is not very good. Additionally, Ultimaker uses 2.85 mm filament, and the printers are expensive: the Ultimaker S5, for example, costs around 7000 euros. Overall, I do not recommend Ultimaker printers; they are expensive and offer a mediocre experience.

DIY

In 2017, I used a DIY 3D printer, which was cheap and fine for getting started. However, DIY printers require self-assembly, and their printing accuracy and stability are limited. A 3D printer is first of all a tool, so accuracy and stability matter a great deal: a good printer gives better fit between parts and better printing results. For long-term use, I therefore do not recommend DIY printers.

Anycubic

The Anycubic mega S3 was my first commercial 3D printer. It was cheap and gave good printing results, and in previous years it was a good choice. In my experience, its printing accuracy and stability are not as good as Ultimaker's, but it is durable: I have used it personally for about four or five years, and it has kept working reliably. Considering the price, Anycubic was a good choice before Bambu Lab appeared.

Bambu Lab

Bambu Lab printers are currently some of the hottest printers. They have been very popular on video sites in China and abroad, such as Bilibili and YouTube, and I have also seen them at the Embedded World and Productronica exhibitions. Their marketing is well done, but in the end they still rely on technology. I bought an A1 mini about two months ago, and it feels really good to use, with fast printing and high precision. My takeaway is that a good product needs a complete ecosystem, from mechanical design to software. With Bambu's products, I can feel that the software gets the most out of the hardware and that all parts work very well together, so they truly deserve to be called good products. At present, I am very optimistic about the development of Bambu Lab.

2. Hardware Design

In the hardware design phase of a robotics project, we need to consider integrating various electronic components and systems to achieve the robot's functions and performance. This part mainly involves the choice of microcontrollers, circuit board design (using EDA tools), firmware development, and the application of sensors and communication technology.

2.1 MCU

Arduino

Arduino is an open-source hardware platform aimed at beginners and is especially suitable for simple robot projects. Its ecosystem is very complete, with rich community resources, numerous libraries, and many tutorials, which makes it quick to integrate sensors and actuators. Many open-source projects are also based on the Arduino architecture. However, I rarely use Arduino in my own projects: its limited processing power, low clock frequency, and poor scalability make it unsuitable for complex or computation-intensive robot projects. For beginners, though, it remains a good entry platform.

Raspberry Pi

As a mini-computer, Raspberry Pi has strong processing power and a rich interface, making it very suitable for robot projects requiring advanced video processing and data handling. However, it consumes more power and is not as stable in real-time control as dedicated microcontrollers.

I have mainly used three versions of Raspberry Pi: Raspberry Pi 3B, Raspberry Pi 4B, and Raspberry Pi Zero. Raspberry Pi 3B was the first Raspberry Pi I used, mainly for small projects like home servers and smart homes. However, due to its limited processing power, I later switched to Raspberry Pi 4B.

The Raspberry Pi 4B (8 GB) is the Raspberry Pi I am currently using, mainly for small robot projects such as smart cars and smart homes. For performance reasons, I first tried the desktop version of Ubuntu but found it was not stable enough, so I switched to Ubuntu Server, which is much more stable and runs ROS Noetic reliably. This project link

I previously used the Raspberry Pi Zero for a hackathon project that had size constraints, limited development time, and some customization needs. This project link .

STM32

The STM32 series of microcontrollers is favored for its high performance and low power consumption, and it is suitable for robot projects requiring precise control and real-time processing. However, the development environment is relatively complex, and beginners may face a steeper learning curve.

The STM32F103C8T6 and STM32F103ZET6 are the two STM32 microcontrollers I have used, and they were my entry-level microcontrollers. I have built projects on STM32, such as a quadcopter. This project link . In later projects, however, I found that the STM32's processing power is limited, peripheral support is modest, and the chips are relatively expensive, which makes them a poor fit for some small projects. On the other hand, the STM32 ecosystem is very complete, with many libraries and tutorials, and many open-source projects are based on it, so STM32 is still a good choice for beginners.

ESP32

ESP32 provides Wi-Fi and Bluetooth functionality, making it very suitable for projects requiring remote control and data transmission. Its energy efficiency is relatively high, although its processing power and resources do not match those of higher-end microcontrollers. Overall, ESP32 is still a good choice and is the controller I use the most. ESP32 development is well supported in the PlatformIO environment. With its dual-core processor running at up to 240 MHz, ESP32 performs well in embedded tasks. Combined with Wi-Fi and Bluetooth support, it can communicate well with ROS, which makes it a great choice for robot projects.

RK3566

I came across the RK3566 on LCSC. It is a high-performance chip that runs Linux and can therefore support ROS development. It is also very cheap, at about 66 RMB (for comparison, around 2023, during the chip shortage, the STM32F103C8T6 was sold for 40+ RMB), so I plan to try it in future projects to see how it performs.

2.2 EDA

Altium Designer

Altium Designer is a professional-level circuit design software that provides powerful design functions and simulation tools, very suitable for complex electronic projects. However, due to its high licensing costs and complex operating interface, it may not be suitable for beginners.

In the Drone-Mercury project, Altium Designer was used. I found that due to the incomplete device library, I had to manually add many devices, which made the design time longer. Moreover, Altium Designer's operating interface is quite complex and requires a certain learning curve. However, Altium Designer provides powerful simulation tools that can help designers verify the performance and stability of circuits. Overall, Altium Designer is suitable for complex electronic projects but may not be suitable for beginners and some DIY projects.

LCSC EDA

LCSC EDA is an easy-to-use online EDA tool, especially suitable for rapid prototyping and small-scale production. Its online features make collaboration easy, but it may not be as comprehensive in functionality as professional software. In recent projects over the past two years, I have mainly used LCSC EDA because its operating interface is simple and easy to use, and it provides a rich device library, allowing for quick circuit design. LCSC EDA also provides an online PCB manufacturing service, which can help users quickly manufacture PCB boards. Moreover, LCSC EDA is relatively low-priced, suitable for individual users and small businesses. Overall, LCSC EDA is suitable for rapid prototyping and small-scale production but may not be suitable for complex electronic projects.

2.3 Firmware

Arduino IDE

Arduino IDE is an integrated development environment for writing and uploading Arduino code. In robot design, Arduino IDE can be used to write firmware for control systems, including sensor data collection and actuator control. However, Arduino IDE's functionality is relatively limited and not suitable for complex projects.

STM HAL Library

The STM HAL library is a hardware abstraction layer provided by ST for STM32. In robot design, it can be used to write STM32 firmware, including peripheral initialization and interrupt handling. The HAL provides a rich API that helps developers write STM32 firmware quickly, but it has a steep learning curve.

STM32CubeIDE

STM32CubeIDE is an integrated development environment for writing and uploading STM32 code. In robot design, STM32CubeIDE can be used to write firmware for real-time control systems, including attitude control, motion control, etc. Due to its powerful visualization and debugging features, STM32CubeIDE can improve development efficiency. Moreover, it can generate code in combination with CubeMX, reducing development time.

ESP-IDF

ESP-IDF is a development framework for writing and uploading ESP32 code. In robot design, ESP-IDF can be used to write firmware for communication systems, including wireless communication and data transmission. In a project where I used ESP-IDF, I found its code well written: it encapsulates low-level hardware operations cleanly, which greatly reduced development time. ESP-IDF also provides a rich API that helps developers write ESP32 firmware quickly. However, it does have a high entry barrier and requires a C++ foundation and some embedded development experience.

PlatformIO

PlatformIO is an open-source IoT development platform that supports multiple development boards and environments. In robot design, PlatformIO can be used to manage firmware development, compilation, and uploading. Since my projects are small or medium-sized, I mainly use PlatformIO, which is highly integrated and supports ESP32 development well. It does not require complex configuration and is easy to get started with. Overall, PlatformIO improves development efficiency for small and medium-sized projects and is a good choice for personal projects.

2.4 Sensors

Sensors are an important part of robot design, used to perceive the robot's surrounding environment and state. Common sensors include IMU, LiDAR, cameras, ultrasonic sensors, force sensors, bend sensors, temperature and humidity sensors, stretch sensors, etc.

IMU

An IMU (Inertial Measurement Unit) measures the robot's attitude and motion. In robot design, an IMU can be used for attitude control, motion control, etc. Common IMUs include the MPU6050, MPU9250, and BNO055.

LiDAR

LiDAR is a sensor used to measure the robot's surrounding environment. In robot design, LiDAR can be used for environmental perception, obstacle avoidance, etc. However, LiDAR is expensive: even the cheapest units cost about 200 EUR, which makes it unsuitable for small projects.

Camera & RGBD

Cameras capture image information around the robot. In robot design, cameras can be used for image processing, target recognition, etc. The Intel RealSense D435i is a commonly used depth camera that can be used for SLAM, target recognition, etc.

Ultrasonic Sensor

Ultrasonic sensors measure the distance between the robot and obstacles. In robot design, they can be used for obstacle avoidance, positioning, etc. However, their accuracy and measuring range are limited.
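The distance reading comes from the echo's time of flight. Below is a minimal sketch, assuming a sensor that reports the round-trip echo time in seconds and using a common linear approximation for the speed of sound in air; the function names are illustrative and not tied to any particular driver.

```python
def speed_of_sound(temp_c: float = 20.0) -> float:
    """Approximate speed of sound in dry air (m/s) at temp_c degrees Celsius."""
    return 331.3 + 0.606 * temp_c


def echo_to_distance(echo_time_s: float, temp_c: float = 20.0) -> float:
    """Convert a round-trip echo time (s) into a one-way distance (m)."""
    if echo_time_s < 0:
        raise ValueError("echo time must be non-negative")
    # The pulse travels to the obstacle and back, so halve the path length.
    return speed_of_sound(temp_c) * echo_time_s / 2.0


if __name__ == "__main__":
    # A 5.83 ms echo at 20 degrees Celsius corresponds to roughly one metre.
    print(f"{echo_to_distance(5.83e-3):.3f} m")
```

Because the speed of sound changes with temperature, passing the ambient temperature can noticeably reduce the ranging error.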

Force Sensor

Force sensors measure the force applied by or to the robot. In robot design, they can be used for force control, force feedback, etc. The main types include pressure sensors, strain sensors, and more advanced 6-axis force sensors, which measure both force and torque. However, 6-axis force sensors are expensive and, because of their measuring principle, difficult to miniaturize, so miniaturization is a direction for future development.

Bend Sensor

Bend sensors are relatively rare sensors used to measure how much a part of the robot bends. In robot design, they can be used for posture control, motion control, etc. The main types are resistive and capacitive bend sensors.

Temperature and Humidity Sensor

Temperature and humidity sensors are sensors used to measure the temperature and humidity around the robot. In robot design, temperature and humidity sensors can be used for environmental perception, control systems, etc. Common temperature and humidity sensors include DHT11, DHT22, etc.

Stretch Sensor

Stretch sensors measure how much a part of the robot stretches. In robot design, they can be used for force control, force feedback, etc. The main types are resistive and capacitive stretch sensors.

2.5 Communication

In robot design, communication technology is a key component, allowing the robot to exchange data with external devices, other robots, or networks.

Bluetooth

Bluetooth technology is suitable for short-distance communication, with low power consumption and relatively low cost. It is very suitable for interaction between handheld devices and robots or for low data rate transmission. However, Bluetooth's communication distance and data transmission rate are low, which may not be suitable for applications requiring long-distance or high-speed data transmission.

Wi-Fi

Wi-Fi provides higher data transmission speed and better communication distance, suitable for applications requiring large data transmission or video streaming. In robot control, Wi-Fi is often used for remote control, data transmission, etc. Its stability and speed are relatively good, and communication can run over TCP/IP or UDP with good results.
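As an illustration of UDP-based robot communication, here is a minimal sketch that sends one command datagram over the loopback interface with Python's standard `socket` module. The JSON message layout (`linear`, `angular`) is a hypothetical format chosen for the example, not a protocol from any of the projects above.

```python
import json
import socket


def make_receiver(host: str = "127.0.0.1") -> socket.socket:
    """Open a UDP socket on a free port chosen by the OS."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, 0))      # port 0 lets the OS pick a free port
    sock.settimeout(1.0)      # avoid blocking forever if a datagram is lost
    return sock


def send_command(addr, linear: float, angular: float) -> None:
    """Send one velocity command as a JSON datagram (hypothetical format)."""
    payload = json.dumps({"linear": linear, "angular": angular}).encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, addr)


def receive_command(sock: socket.socket) -> dict:
    """Wait for one datagram (up to the socket timeout) and decode it."""
    data, _sender = sock.recvfrom(1024)
    return json.loads(data.decode("utf-8"))


if __name__ == "__main__":
    receiver = make_receiver()
    send_command(receiver.getsockname(), linear=0.2, angular=-0.5)
    print(receive_command(receiver))
    receiver.close()
```

UDP gives low latency but no delivery guarantee, so a real controller would typically add sequence numbers or a timeout that stops the robot when commands stop arriving.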

LoRa

LoRa is a low-power wide-area network technology, especially suitable for long-distance, low-power communication needs. It can support communication distances of several kilometers, making it very suitable for large-scale monitoring and control applications. LoRa's data transmission rate is low, making it unsuitable for high-speed data transmission needs, such as video transmission.

Zigbee

Zigbee is a low-power short-distance wireless communication technology. Its network configuration is flexible, supporting connections with a large number of devices, making it suitable for building complex sensor networks and automation systems. Compared to other communication technologies, Zigbee's data transmission rate is low, and its communication distance is also short, which may not be suitable for all types of robot applications.

2.6 Real-time OS

In hardware design, real-time operating systems (RTOS) are a key technology used for multitasking processing and real-time control. RTOS can provide task scheduling, interrupt handling, memory management, etc., to ensure the stability and reliability of the robot.

Common RTOS include FreeRTOS, Zephyr, RT-Thread, etc. In my projects, I mainly use FreeRTOS, an open-source RTOS that provides a rich API and sample code, helping developers quickly build real-time control systems. Both Arduino IDE and PlatformIO support FreeRTOS, making it very convenient to integrate into projects.

3. Software

3.1 ROS/ROS2

ROS (Robot Operating System) provides a set of libraries and tools that simplify the development of robot software. In robot design, ROS can be used to build the robot's software system, including sensor data processing, motion control, etc.

Most of my projects are based on ROS. Overall, ROS is better understood as a communication framework than as an operating system, despite what its name suggests. ROS supports robot development well, but it also has drawbacks, such as insufficient hardware support. For example, ROS serial projects running over Wi-Fi may disconnect during long sessions, possibly because of ROS's communication mechanism.

ROS2 is the next generation of ROS. It addresses several shortcomings of ROS1, with a better communication mechanism (based on middleware) and better platform support (including embedded systems and Windows 10). In my experience, however, migrating from ROS1 to ROS2 is not easy: some ROS2 APIs and features are not compatible with ROS1, so relearning and redevelopment are required. Although ROS2 has been iterated for a long time, there are still many bugs, such as compilation problems, and some APIs are inconvenient to use. But since Ubuntu 22.04 no longer supports ROS1, switching to ROS2 is inevitable. Judging by the Nvidia Isaac Sim 4.0.0 update as well, the transition to ROS2 appears imminent.

3.2 Simulation

Gazebo

Gazebo is an open-source robot simulation software that provides a rich model library and simulation environment, which can be used to simulate the robot's movement and behavior. In robot design, Gazebo can be used to verify the robot's control algorithms, path planning, etc.

PyBullet

PyBullet is an open-source physics engine that provides a rich model library and simulation environment for simulating the robot's movement and behavior. In robot design, PyBullet can be used to verify control algorithms, path planning, etc. For verifying robot control algorithms such as whole-body control (WBC) and inverse kinematics (IK), PyBullet is a good choice.

MuJoCo

MuJoCo is a free physics engine that provides a rich model library and simulation environment for simulating the robot's movement and behavior (MuJoCo was acquired by DeepMind in 2021, which then released it as free software). In robot design, MuJoCo can be used to verify control algorithms, path planning, etc. More and more projects are now based on MuJoCo.

Unity

Although Unity is primarily designed for game development, its use in robot simulation is increasing. Unity's powerful graphics capability and user-friendly interface make it possible to create realistic simulation environments. Its disadvantage is that, compared to traditional robot simulation software, it falls short in physical accuracy and real-time performance. It is mainly used for testing autonomous driving, reinforcement learning, etc., and for those uses Unity is a good choice.

Nvidia Isaac Sim

With the launch of Isaac Sim 4.0.0, Orbit has also officially been renamed Isaac Lab. Here is the official introduction:

Isaac Lab is an open-source, lightweight reference application built on the Isaac Sim platform, playing an important role in training robot foundation models. It supports reinforcement learning, imitation learning, and transfer learning, and it can train various robot embodiments so that developers can explore design and functionality.

- Easy-to-use integration with VS Code and a compatibility checker
- Multi-GPU support for reinforcement learning
- Improved performance through tiled rendering of RTX sensors and optimized cache and shader management
- Easy-to-use pip installation and a wizard for importing robots
- Performance improvement: synthetic data generation (SDG) speed increased by 80%
- Support for the COCO format and a new SDG format for custom writers for pose estimation
- A ROS 2 release with better performance for end-to-end workflows and image-based publishers
- Support for more built-in robots, including a series of humanoid robots (1X Neo, Unitree H1, Agility Digit, Fourier Intelligence GR1, Sanctuary AI Phoenix, and XiaoPeng PX5), as well as Universal Robots UR20 and UR30 and Boston Dynamics Spot

Nvidia's ambitions in robotics are evident, and with more and more major companies (Nvidia, Apple, Meta, etc.) entering the field, it is clear how much attention robots are now getting. I will write a separate article about humanoid robots in the future to explore where robotics is heading.

3.3 3D Modeling

Blender

Blender is an open-source 3D modeling package that provides a rich model library and modeling tools, which can be used to design the appearance and structure of robots. In robot design, Blender can be used for modeling, rendering, etc.

Additionally, USD (Universal Scene Description) is an open scene description format developed by Pixar for describing the geometry, materials, animation, etc., of 3D scenes. USD can be used to build complex 3D scenes and is supported by multiple tools and engines, such as Blender, Unity, and Unreal. In robot design, USD can be used to build the robot's 3D model, animation, etc. USD is more flexible than URDF and supports more features, such as animation and materials. In Sim2Real research, USD is a good choice.

4. Algorithms

Algorithms are the core part of robot design, determining the robot's perception, decision-making, and control capabilities. Common algorithms include perception, kinematics and dynamics, robot control, path planning and motion planning, decision-making and autonomy, multi-robot systems, etc.

Due to the wide variety of algorithms, only some common algorithms are listed here, along with corresponding project links for more details.

4.1. Perception

Perception, as I understand it, is how to process and use sensor data to obtain information about the environment around the robot. Perception includes sensor data processing, computer vision, 3D perception, etc. Depending on the type of sensors and application scenarios, perception can use different methods and algorithms.

4.1.1 Sensor Data Processing

Sensor Fusion
- Kalman Filter
- Extended Kalman Filter (EKF)
- Particle Filter
Data Preprocessing
- Noise Filtering
- Data Calibration
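
To make the Kalman filter entry concrete, here is a minimal sketch of a one-dimensional filter that smooths noisy readings of a nearly constant quantity (for example, a distance sensor pointed at a fixed wall). The noise variances and the constant-state model are assumptions chosen for illustration.

```python
import random


def kalman_1d(measurements, process_var=1e-4, meas_var=0.04, x0=0.0, p0=1.0):
    """Filter noisy readings of a (nearly) constant scalar.

    Returns the list of state estimates after each measurement update.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state is modelled as constant, so only uncertainty grows.
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        k = p / (p + meas_var)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates


if __name__ == "__main__":
    random.seed(0)
    true_value = 1.25
    readings = [true_value + random.gauss(0.0, 0.2) for _ in range(200)]
    print(f"last raw reading:  {readings[-1]:.3f}")
    print(f"filtered estimate: {kalman_1d(readings)[-1]:.3f}")
```

The EKF follows the same predict/update cycle but linearizes a nonlinear motion or measurement model at each step.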

4.1.2 Computer Vision

Image Processing
- Edge Detection
- Feature Extraction
Object Recognition
- Object Detection (such as YOLO, SSD)
- Object Recognition (such as VGG, ResNet)
Deep Learning
- Convolutional Neural Networks (CNN)
- Generative Adversarial Networks (GAN)

4.1.3 3D Perception

Point Cloud Processing
- Point Cloud Registration (ICP Algorithm)
- Point Cloud Segmentation
Depth Camera
- Stereo Vision
- Deep Learning Methods (such as PointNet)

RoboCup@Home Challenge

4.2. Kinematics and Dynamics

In the design of robotic arms and mobile robots, kinematics and dynamics are very important. Kinematics describes the robot's motion without regard to the forces that cause it, while dynamics relates that motion to forces and torques. Studying both helps us design the robot's control system and achieve precise motion and force control.

4.2.1 Kinematics

Forward Kinematics
- Mapping from joint space to task space
Inverse Kinematics
- Mapping from task space to joint space
Constrained Kinematics
- Cartesian space constraints
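
The two mappings can be sketched for the simplest non-trivial case, a planar arm with two revolute joints. This is an illustrative closed-form solution under assumed link lengths; the function and parameter names (including `flip_elbow`) are hypothetical.

```python
import math


def forward_kinematics(q1, q2, l1=1.0, l2=1.0):
    """Joint angles (rad) -> end-effector position (x, y) of a planar 2R arm."""
    x = l1 * math.cos(q1) + l2 * math.cos(q1 + q2)
    y = l1 * math.sin(q1) + l2 * math.sin(q1 + q2)
    return x, y


def inverse_kinematics(x, y, l1=1.0, l2=1.0, flip_elbow=False):
    """End-effector position -> joint angles; raises ValueError if unreachable."""
    # Law of cosines gives the cosine of the elbow angle.
    c2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if abs(c2) > 1.0 + 1e-12:
        raise ValueError("target is outside the workspace")
    c2 = max(-1.0, min(1.0, c2))
    s2 = math.sqrt(1.0 - c2 * c2)
    if flip_elbow:
        s2 = -s2                      # the mirrored elbow configuration
    q2 = math.atan2(s2, c2)
    q1 = math.atan2(y, x) - math.atan2(l2 * s2, l1 + l2 * c2)
    return q1, q2


if __name__ == "__main__":
    target = forward_kinematics(0.3, 0.8)
    print("target:", target)
    print("IK solution:", inverse_kinematics(*target))
    print("mirrored solution:", inverse_kinematics(*target, flip_elbow=True))
```

Forward kinematics has exactly one answer, while inverse kinematics here has two (or none for unreachable targets), which is why IK is usually the harder problem.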

4.2.2 Dynamics

Dynamics Modeling
- Newton-Euler Method
- Lagrange Method
Dynamics Control
- Force Control
- Impedance Control
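
As a rough illustration of impedance control, the sketch below makes a 1-DOF point mass behave like a virtual spring-damper around a desired position. The mass, stiffness, and damping values are assumptions picked to give a critically damped response; a real manipulator would use the full dynamics model.

```python
def impedance_force(x, v, x_des, v_des=0.0, stiffness=100.0, damping=20.0):
    """Virtual spring-damper force pulling the mass toward the desired state."""
    return stiffness * (x_des - x) + damping * (v_des - v)


def simulate_mass(x_des=0.1, external_force=0.0, mass=1.0, dt=0.001, steps=3000):
    """Integrate a 1-DOF point mass under the impedance law; return the final x."""
    x, v = 0.0, 0.0
    for _ in range(steps):
        f = impedance_force(x, v, x_des) + external_force
        a = f / mass
        v += a * dt          # semi-implicit Euler: update velocity first
        x += v * dt
    return x


if __name__ == "__main__":
    print(f"free motion settles at   {simulate_mass():.4f} m")
    # A constant 5 N push deflects the mass by force / stiffness = 0.05 m.
    print(f"with a 5 N push it rests at {simulate_mass(external_force=5.0):.4f} m")
```

The second case shows the point of impedance control: instead of rejecting the external force, the robot yields to it by an amount set by the chosen stiffness.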

4.2.3 Robotics Kinematics and Dynamics Projects

UR10 Impedance control and Haptic Force Feedback via Smart Knob

4.3. Robot Control

Robot control is the core part of robot design, determining the robot's movement and behavior. Robot control can be divided into classic control and advanced control, and different control methods can be chosen according to different application scenarios and needs. Among them, MPC (Model Predictive Control) is widely used in humanoid robots and autonomous driving.

4.3.1 Classic Control

PID Control
- Proportional-Integral-Derivative Controller
Adaptive Control
- Adaptive PID Control
- Model Reference Adaptive Control (MRAC)
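
A PID controller can be sketched in a few lines. The example below is a textbook discrete form without anti-windup, driving an assumed first-order plant (such as a motor speed loop) toward a setpoint; the gains and plant time constant are illustrative values, not tuned for any real hardware.

```python
class PID:
    """Textbook discrete PID controller (no anti-windup, for illustration)."""

    def __init__(self, kp, ki, kd, output_limit=None):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.output_limit = output_limit
        self.integral = 0.0
        self.prev_error = None

    def update(self, setpoint, measurement, dt):
        error = setpoint - measurement
        self.integral += error * dt
        # There is no previous error on the first sample, so skip the D term.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        output = self.kp * error + self.ki * self.integral + self.kd * derivative
        if self.output_limit is not None:
            output = max(-self.output_limit, min(self.output_limit, output))
        return output


def simulate_pid(setpoint=1.0, tau=0.5, dt=0.01, steps=1000):
    """Drive a first-order plant dy/dt = (u - y) / tau toward the setpoint."""
    pid = PID(kp=2.0, ki=4.0, kd=0.05)
    y = 0.0
    for _ in range(steps):
        u = pid.update(setpoint, y, dt)
        y += dt * (u - y) / tau   # explicit Euler step of the plant
    return y


if __name__ == "__main__":
    print(f"output after 10 s: {simulate_pid():.4f}")
```

The integral term is what removes the steady-state error here; with `ki=0` the same plant would settle below the setpoint.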

4.3.2 Advanced Control

Optimal Control
- Linear Quadratic Regulator (LQR)
- Model Predictive Control (MPC)
Nonlinear Control
- Sliding Mode Control
- Feedback Linearization
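
For LQR, the scalar discrete-time case is small enough to show the whole computation. This sketch assumes a system x[k+1] = a*x[k] + b*u[k] with cost sum(q*x^2 + r*u^2) and finds the gain by iterating the Riccati recursion; the example values are arbitrary.

```python
def lqr_scalar(a, b, q, r, iterations=1000, tol=1e-12):
    """Return (gain k, cost-to-go p) for the control law u = -k * x."""
    p = q
    for _ in range(iterations):
        # Discrete-time Riccati recursion for a scalar system.
        p_next = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
        converged = abs(p_next - p) < tol
        p = p_next
        if converged:
            break
    k = a * b * p / (r + b * b * p)
    return k, p


if __name__ == "__main__":
    a, b = 1.1, 1.0            # open-loop unstable because |a| > 1
    k, p = lqr_scalar(a, b, q=1.0, r=1.0)
    print(f"gain k = {k:.4f}, closed-loop pole = {a - b * k:.4f}")
```

MPC solves a similar optimization repeatedly over a finite horizon, which lets it handle constraints that LQR cannot.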

4.3.3 Robot Control Projects

Human-Robot Joint Synchronization Teleoperation

4.4. Path Planning and Motion Planning

Path planning and motion planning determine the robot's motion trajectory and behavior. Path planning describes how the robot moves from the starting point to the endpoint, and motion planning describes how the robot moves along the given path. The study of path planning and motion planning can help us design the robot's autonomous navigation and motion control systems.

4.4.1 Path Planning

Graph Search Algorithms
- A* Algorithm
- Dijkstra Algorithm
Sampling-Based Methods
- Rapidly-Exploring Random Tree (RRT)
- RRT*
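
As a small example of graph search, here is a sketch of A* on a 4-connected occupancy grid with a Manhattan-distance heuristic. The grid encoding (`#` for obstacles) is an assumption made for the example.

```python
import heapq


def astar(grid, start, goal):
    """Return the shortest path as a list of (row, col) cells, or None.

    grid is a list of equal-length strings where '#' marks an obstacle.
    """
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance is admissible for 4-connected unit-cost moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]
    came_from = {}
    g_cost = {start: 0}
    while open_heap:
        _, g, current = heapq.heappop(open_heap)
        if current == goal:
            path = [current]
            while current in came_from:
                current = came_from[current]
                path.append(current)
            return path[::-1]
        if g > g_cost[current]:
            continue  # stale heap entry, a cheaper route was found already
        r, c = current
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != "#":
                new_g = g + 1
                if new_g < g_cost.get(nxt, float("inf")):
                    g_cost[nxt] = new_g
                    came_from[nxt] = current
                    heapq.heappush(open_heap, (new_g + heuristic(nxt), new_g, nxt))
    return None


if __name__ == "__main__":
    demo_grid = [
        "....",
        ".##.",
        "....",
    ]
    print(astar(demo_grid, (0, 0), (2, 3)))
```

Setting the heuristic to zero turns this into Dijkstra's algorithm, which explores more cells but finds the same shortest path length.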

4.4.2 Motion Planning

Trajectory Generation
- Cubic Spline Interpolation
- Polynomial Trajectories
Constrained Optimization
- Path Optimization under Dynamics Constraints
- Time-Optimal Control
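
A polynomial trajectory can be illustrated with the rest-to-rest cubic, which moves a joint from one position to another with zero velocity at both ends. This is a minimal sketch for a single joint; the function names are illustrative.

```python
def cubic_trajectory(q0, q1, duration):
    """Return position(t) and velocity(t) for a rest-to-rest cubic profile."""
    if duration <= 0:
        raise ValueError("duration must be positive")
    delta = q1 - q0

    def normalized_time(t):
        # Clamp so the trajectory holds its end points outside [0, duration].
        return min(max(t / duration, 0.0), 1.0)

    def position(t):
        s = normalized_time(t)
        return q0 + delta * (3.0 * s ** 2 - 2.0 * s ** 3)

    def velocity(t):
        s = normalized_time(t)
        return delta * (6.0 * s - 6.0 * s ** 2) / duration

    return position, velocity


if __name__ == "__main__":
    pos, vel = cubic_trajectory(q0=0.0, q1=1.0, duration=2.0)
    for t in (0.0, 0.5, 1.0, 1.5, 2.0):
        print(f"t={t:.1f}  q={pos(t):.3f}  dq={vel(t):.3f}")
```

The peak velocity occurs at mid-trajectory and equals 1.5 times the average velocity, which is useful when checking joint speed limits.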

4.4.3 Path Planning and Motion Planning Projects

SLAM mapping and navigation simulation project

4.5. Decision Making and Autonomy

Decision-making and autonomy are advanced parts of robot design, determining the robot's intelligence and autonomy. The study of decision-making and autonomy can help us design the robot's autonomous navigation and task execution systems. With the development of deep learning and reinforcement learning, research on decision-making and autonomy is also advancing.

I have recently read many papers in this direction, and I might specifically organize a review to introduce the development of this direction in the future.

4.5.1 State Estimation

Bayesian Filtering
- Markov Localization
- Monte Carlo Localization
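
Markov localization can be sketched on a tiny one-dimensional cyclic world: the robot keeps a probability for each cell, sharpens it with each measurement, and blurs it with each move. The world layout and the sensor and motion probabilities below are assumptions chosen for the example.

```python
def sense(belief, world, measurement, p_hit=0.8, p_miss=0.2):
    """Bayes update: weight each cell by how well it explains the measurement."""
    weighted = [
        b * (p_hit if cell == measurement else p_miss)
        for b, cell in zip(belief, world)
    ]
    total = sum(weighted)
    return [w / total for w in weighted]


def move(belief, step, p_exact=0.8, p_under=0.1, p_over=0.1):
    """Motion update on a cyclic world with under- and overshoot noise."""
    n = len(belief)
    new_belief = [0.0] * n
    for i, b in enumerate(belief):
        new_belief[(i + step) % n] += p_exact * b
        new_belief[(i + step - 1) % n] += p_under * b
        new_belief[(i + step + 1) % n] += p_over * b
    return new_belief


def localize(world, measurements, step=1):
    """Start from a uniform belief and alternate sensing and moving."""
    belief = [1.0 / len(world)] * len(world)
    for i, z in enumerate(measurements):
        belief = sense(belief, world, z)
        if i < len(measurements) - 1:
            belief = move(belief, step)
    return belief


if __name__ == "__main__":
    world = ["door", "wall", "wall", "door", "wall"]
    belief = localize(world, ["door", "wall", "wall"])
    print([round(b, 3) for b in belief])
```

Monte Carlo localization follows the same sense/move cycle but represents the belief with a set of weighted particles instead of one probability per cell.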

4.5.2 Reinforcement Learning

Value Function Methods
- Q-learning
- Deep Q Network (DQN)
Policy Gradient Methods
- Policy Gradient
- Proximal Policy Optimization (PPO)
Actor-Critic Methods
- Deep Deterministic Policy Gradient (DDPG)
- Asynchronous Advantage Actor-Critic (A3C)
- Advantage Actor-Critic (A2C)
Multi-Agent Reinforcement Learning
- MADDPG
- MARL
- AlphaZero
Reinforcement Learning from Human Feedback
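Of the methods listed above, tabular Q-learning is the easiest to show in a few lines. The sketch below uses a toy five-state chain invented for this example: the agent starts at the left end and receives a reward of 1 on reaching the right end. The environment, the hyperparameters, and the purely random behavior policy are all assumptions. Because Q-learning is off-policy, it can still learn the greedy policy from random exploration.

```python
import random

N_STATES = 5         # states 0..4 on a line; state 4 is the goal
ACTIONS = (-1, +1)   # action index 0 moves left, index 1 moves right


def step(state, action_index):
    """Deterministic chain dynamics with walls at both ends."""
    next_state = min(max(state + ACTIONS[action_index], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.randrange(2)  # uniformly random behavior policy
            next_state, reward, done = step(state, action)
            # Q-learning target: bootstrap from the best action in the next state.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train()
greedy_policy = [row.index(max(row)) for row in q_table[:-1]]  # skip the terminal state
```

DQN replaces the table with a neural network and adds a replay buffer and a target network, but the update target has the same form.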

4.5.3 Imitation Learning

Behavior Cloning
Inverse Reinforcement Learning

4.5.4 Projects

Reinforcement Learning Review

5. Tools

In this section, we provide a brief introduction to some commonly used tools, including IDEs, simulation software, virtual environments, and large language model (LLM) tools.

5.1 IDE (Integrated Development Environment)

IDEs are essential tools in software development, significantly improving development efficiency by providing an all-in-one environment for coding, debugging, and running applications. There are various IDEs suitable for different programming languages and development needs. Below are introductions to three IDEs I frequently use: Visual Studio Code, PyCharm, and CLion.

5.1.1 Visual Studio Code

Visual Studio Code (VS Code) is a free, open-source code editor developed by Microsoft. It supports multiple programming languages and offers a wide range of extensions and plugins. The advantages of VS Code include cross-platform support (Windows, macOS, and Linux), fast startup, and good performance. It has numerous extensions and plugins to meet various development needs and built-in Git support for version control. VS Code also supports debugging for multiple languages. However, many advanced features rely on plugins, which can complicate plugin management, and there might be performance issues with large projects. VS Code is suitable for developers of all kinds, especially those who need quick setup and flexible configuration.

Personally, I really like VS Code. It meets almost all my needs, and combined with the right plugins it becomes a very powerful productivity tool. Plugins I highly recommend include GitHub Copilot, PlatformIO, and the plugins for Markdown, Git, and SSH remote development.

5.1.2 PyCharm

PyCharm, developed by JetBrains, is an IDE designed specifically for Python development, offering powerful code analysis and debugging features. Its advantages include intelligent code completion and analysis, which improve development efficiency. It has built-in database tools, a debugger, and testing tools for easy project management and debugging. It also supports scientific computing and data analysis well, integrating tools like Jupyter Notebook, and supports Python web frameworks like Django and Flask. However, compared to lightweight editors, PyCharm starts slowly and consumes a lot of memory. Additionally, the Community Edition has limited features, and the Professional Edition requires a paid license. PyCharm is suitable for Python developers, especially those dealing with complex projects and scientific computing. For general projects, the Community Edition is a good place to start.

5.1.3 CLion

CLion, also developed by JetBrains, is a cross-platform C/C++ IDE offering intelligent code completion, refactoring, and navigation features. The advantages of CLion include cross-platform support (Windows, macOS, and Linux), powerful code analysis and completion, and built-in GDB and LLDB debuggers for easy code debugging. CLion also fully supports CMake, simplifying project building and management. However, CLion starts slowly, has high memory usage, and requires a paid license with a limited free trial. CLion is suitable for professional C/C++ developers, especially those dealing with large projects and complex debugging tasks. For personal projects, free options like Visual Studio Code with C++ plugins can be considered.

5.2 MATLAB & Simulink

MATLAB, developed by MathWorks, is a high-level programming language and interactive environment widely used in scientific computing, engineering design, and simulation. Simulink, an add-on to MATLAB, is used for multi-domain simulation and model-based design.

MATLAB is an excellent tool and holds an almost irreplaceable position in control simulation. However, licensing must be considered: MATLAB is commercial software, and not every university or company holds the necessary licenses. The paid editions of PyCharm and CLion raise the same concern.

5.3 Git

Git is a distributed version control system used to track source code changes, facilitate collaborative development, and manage projects. The advantages of Git include its distributed nature, allowing each developer to have a complete repository and work offline; efficient branch and merge operations, making it suitable for large projects and team collaboration; and its widespread use, with extensive community resources and tool support. However, Git has a steep learning curve for beginners, with complex command-line operations and concepts, and managing complex histories may require additional tools and techniques. It is highly recommended for all developers to learn and use Git, along with platforms like GitHub and GitLab, to greatly improve team collaboration efficiency. Beginners can start with GUI tools (e.g., Sourcetree) and gradually familiarize themselves with command-line operations.

5.4 Virtual Environment

Virtual environments are used to isolate project dependencies, ensuring compatibility and independence between different projects.

5.4.1 Conda/Anaconda

Conda is an open-source package and environment management system, while Anaconda is a Python/R distribution that includes Conda and is preloaded with numerous scientific computing packages. The advantages of Conda include simplified package installation and management, resolving dependency conflicts, and easy creation and switching of virtual environments to isolate project dependencies. Anaconda, with its preloaded scientific computing and data analysis packages, further enhances this. However, Anaconda comes with many packages, resulting in a large installation size, and in some regions, the default source can be slow, requiring mirror configuration. For developers dealing with multiple Python projects and scientific computing, Conda and Anaconda are excellent choices. For lightweight needs, Miniconda can be considered.
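When juggling several environments, it helps to confirm which one the interpreter is actually running in. The sketch below is one way to check this from Python. The helper name `describe_environment` is hypothetical. It relies on the `CONDA_DEFAULT_ENV` variable that Conda sets on activation and on the difference between `sys.prefix` and `sys.base_prefix` that marks a `venv`-style environment.

```python
import os
import sys


def describe_environment(environ=None, prefix=None, base_prefix=None):
    """Report the active environment; the arguments exist only to make testing easy."""
    environ = os.environ if environ is None else environ
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix

    conda_env = environ.get("CONDA_DEFAULT_ENV")
    if conda_env:
        return f"conda:{conda_env}"
    if prefix != base_prefix:
        # venv and virtualenv point sys.prefix at the environment directory.
        return f"venv:{prefix}"
    return "system"


current = describe_environment()
```

Printing `current` at the top of a script is a cheap way to catch the common mistake of running a project with the wrong interpreter.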

5.4.2 Docker

Docker is an open-source containerization platform used for automating application deployment and management. Its advantages include providing isolated environments for each container, ensuring project separation, maintaining consistent development and production environments across different operating systems, and supporting container orchestration tools (e.g., Kubernetes) for large-scale deployment. However, Docker requires learning the basics of Dockerfile writing and container management, with a certain learning curve, and in some cases, containerization may introduce performance overhead. Docker is suitable for developers needing cross-platform deployment and large-scale applications. Beginners can start with simple Docker container management and gradually learn advanced features.

5.4.3 WSL/WSL2

Windows Subsystem for Linux (WSL) lets you run Linux binary executables on Windows. WSL1 is a compatibility layer that translates Linux system calls, while WSL2, its successor, runs a full Linux kernel inside a lightweight virtual machine and offers higher performance for most workloads. The advantages of WSL include running Linux tools and scripts on Windows without setting up and managing a traditional virtual machine, near-native Linux performance under WSL2, and convenient Linux development on Windows. However, some low-level system functions may behave differently from native Linux, and WSL, especially WSL2, consumes additional system resources. WSL is suitable for users who develop for Linux while working on Windows, especially those who switch frequently between the two systems. One can start with WSL1 and gradually transition to WSL2.

5.4.4 VMware/VirtualBox

VMware and VirtualBox are two popular virtual machine applications used to run multiple operating systems on a single physical computer. Their advantages include complete isolation, with each virtual machine having its own operating system and resources; the flexibility to run various operating systems and test different environments; and snapshot functionality for easily reverting to a previous state. However, they require significant memory and storage, and performance is lower than running natively. VMware and VirtualBox are suitable for users who need to test multiple operating systems or simulate complex environments. For systems with limited resources, lightweight virtualization solutions can be considered.

5.5 LLM (Large Language Models)

Large language model-based robotics control is currently in the research stage, making replication and testing difficult. However, some large model applications are worth trying, offering interesting and practical tools.

5.5.1 Ollama

Ollama is an open-source tool for serving large language models (LLMs) that lets users deploy and run pre-trained models on their own hardware. It manages model downloads and runs models behind a local server, either installed natively or inside a Docker container, so users can quickly run these models locally. Deployment is straightforward and takes only a few simple steps.
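Once an Ollama server is running, other programs can talk to it over HTTP. The sketch below builds a request for Ollama's `/api/generate` endpoint using only the Python standard library. It assumes the server listens on the default address `http://localhost:11434` and that a model has already been pulled. The model name `llama3` and the helper names are placeholders for illustration.

```python
import json
import urllib.request


def build_generate_request(prompt, model="llama3", host="http://localhost:11434"):
    """Build a non-streaming POST request for the /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, **kwargs):
    """Send the prompt to a running Ollama server and return the generated text."""
    request = build_generate_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.loads(response.read())["response"]


# Example call, which needs a running server and a pulled model:
# print(generate("Explain PID control in one sentence."))
```

Pointing `host` at another machine gives the remote setup, where a server with a GPU runs the model and a laptop only sends requests.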

5.5.2 Continue

Continue lets you create your own coding assistant directly inside Visual Studio Code and JetBrains IDEs using open-source LLMs. Everything can run entirely on your own laptop, or Ollama can be deployed on a server to power code completion and chat remotely, depending on your needs. The advantage of Continue lies in its flexibility and ease of use, allowing developers to integrate a coding assistant in local or remote environments. However, beginners may need some time to get familiar with configuring and using these tools. For developers who want an intelligent assistant during development, Continue offers a robust solution and serves as a local alternative to GitHub Copilot.

5.5.3 AnythingLLM

AnythingLLM can quickly establish a localized knowledge base framework. Below is information from the official website:

AnythingLLM is a full-stack application where you can use commercial off-the-shelf LLMs or popular open-source LLMs and vectorDB solutions to build a private ChatGPT. You can run it locally or host it remotely, enabling intelligent conversations with any documents you provide. AnythingLLM organizes your documents into objects called workspaces. Workspaces function similarly to threads but with added document containerization. Workspaces can share documents but do not communicate with each other, keeping each workspace's context clean.

Some main features of AnythingLLM include:
- Multi-user instance support and permissioning
- Agents inside your workspace (browse the web, run code, etc.)
- Custom embeddable chat widget for your website
- Multiple document type support (PDF, TXT, DOCX, etc.)
- Manage documents in your vector database from a simple UI
- Two chat modes: conversation and query. Conversation retains previous questions and amendments, while query is simple QA against your documents
- In-chat citations
- 100% cloud deployment ready
- "Bring your own LLM" model support
- Extremely efficient cost-saving measures for managing very large documents. You'll never pay to embed a massive document or transcript more than once, making it 90% more cost-effective than other document chatbot solutions
- Full developer API for custom integrations

AnythingLLM's strengths lie in its versatility and flexibility, allowing users to customize according to specific needs and run efficiently in both local and remote environments. However, due to its rich features, beginners may need some time to familiarize themselves and configure these features. For developers needing intelligent document processing and multi-user support, AnythingLLM is a powerful tool.

xiangyu fu's blog