Multiple MX3 device usage

Hello Everyone,

The last time I posted here was about my MemryX MX3 unit being bricked. Since then I have bought a dozen more MX3 units. How do I use multiple MX3 units (4 units) in one PC? Do I need to compile with the argument ‘-c 16’ for 16 chips, or is that not correct? I tried that, and when I ran the DFP I got an error saying that I’m running the DFP with an incorrect number of chips. Any help would be appreciated, thanks.

Hi @ronne991, it depends on what you want to do with the extra modules.

If you’re looking to increase the performance of a single DFP (compiled to 4 chips, the default), just supply a list of devices to the Python/C++ constructors. The runtime will download the DFP to each of the listed devices and automatically balance the load across them. For example, to use the first 5 M.2 modules, set device_ids={0,1,2,3,4}.

If you have multiple different (4-chip) DFPs you want to run in parallel, the approach is similar: specify the devices for each DFP in the constructors. For example, if you have DFPs A, B, and C, you could assign A to device_ids={0}, B to device_ids={1}, and C to device_ids={2}. Any DFP can also be given more than one device if you want to increase that DFP’s performance. For example: A→{0}, B→{1}, C→{2,3}.
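As a rough sketch of the assignments described above, here is one way to write the plan down in Python and sanity-check it before creating any accelerators. The helper names (`check_plan`, `build_accls`) and the file names `A.dfp`, `B.dfp`, `C.dfp` are made up for illustration. The one-DFP-per-device check is this sketch's own conservative assumption, not a rule stated here. `build_accls` assumes the Python constructor is `memryx.AsyncAccl` and that it takes a `device_ids` list, so please verify that against your installed SDK.

```python
from typing import Dict, List


def check_plan(plan: Dict[str, List[int]], num_devices: int) -> Dict[int, str]:
    """Validate a DFP -> device-ID plan and return a device -> DFP map.

    Assumption of this sketch: each device is given to at most one DFP.
    """
    owner: Dict[int, str] = {}
    for dfp, ids in plan.items():
        if not ids:
            raise ValueError(f"{dfp}: needs at least one device")
        for dev in ids:
            if not 0 <= dev < num_devices:
                raise ValueError(f"{dfp}: device {dev} is out of range")
            if dev in owner:
                raise ValueError(f"device {dev} is used by {owner[dev]} and {dfp}")
            owner[dev] = dfp
    return owner


def build_accls(plan: Dict[str, List[int]]) -> dict:
    """Create one accelerator per DFP (needs the MemryX SDK and hardware)."""
    # Assumed API: AsyncAccl(dfp_path, device_ids=[...]); check your SDK docs.
    from memryx import AsyncAccl
    return {dfp: AsyncAccl(dfp, device_ids=ids) for dfp, ids in plan.items()}


# Hypothetical plan matching the example: A->{0}, B->{1}, C->{2,3}
plan = {"A.dfp": [0], "B.dfp": [1], "C.dfp": [2, 3]}
owners = check_plan(plan, num_devices=4)
```

With four modules installed, `build_accls(plan)` would then be the only call that touches the hardware. For the single-DFP case, the plan would just be one entry listing all devices, e.g. `{"A.dfp": [0, 1, 2, 3]}`.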

Another tip: when pushing for max performance, you may want to look into Local Mode, as it can help remove some runtime overhead.

On the other hand, if the goal is to run models that are too large for a single 4-chip M.2, the process is a little more manual. You’ll have to use Manual Cropping to cut your model into multiple DFPs. Once these individual DFPs are made, you can distribute them to different devices like above, and connect the intermediate feature maps in your application (output of A → input of B, etc.).
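To illustrate the "output of A → input of B" wiring, here is a minimal, SDK-free sketch. `stage_a` and `stage_b` are hypothetical stand-ins for whatever function in your application runs one cropped DFP on its device. They do trivial arithmetic here only so the example runs without hardware.

```python
from typing import Any, Callable, Sequence


def run_chain(stages: Sequence[Callable[[Any], Any]], x: Any) -> Any:
    """Feed x through each stage in order; each output becomes the next input."""
    for stage in stages:
        x = stage(x)
    return x


# Hypothetical stand-ins for "run DFP A on device 0" and "run DFP B on device 1".
def stage_a(feature_map):
    return [2 * v for v in feature_map]


def stage_b(feature_map):
    return [v + 1 for v in feature_map]


result = run_chain([stage_a, stage_b], [1, 2, 3])
```

In a real application each stage would wrap the inference call for one cropped DFP, and the intermediate feature map shapes would need to match what the next DFP expects.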

Hey Tim, thanks for the reply. It’s running great :+1:
I’m running face detection + recognition with 10 IP cameras and it’s working well now.
