gMidSurf: Hierarchical GPU-based Mid-surface Abstraction for Thin-walled CAD Models

by Li Ye1, Xinhang Zhou1, Xingyu Yang1, Peng Fan1, Ruofeng Tong1, Hailong Li2, Peng Du1, and Min Tang1,*

1 - Zhejiang University, Hangzhou, 310007, China

2 - Shenzhen Poisson Software Co., Ltd., Shenzhen, 518129, China

* - Corresponding Author

Abstract

Mid-surface abstraction is essential for finite element analysis of thin-walled CAD models, yet existing face pairing-based methods suffer from quadratic complexity and CPU-bound bottlenecks, limiting scalability for variable-thickness models. We present gMidSurf, a GPU-accelerated pipeline that transforms the two computational bottlenecks in mid-surface abstraction (face pairing and mid-point generation) into massively parallel operations. For face pairing, we introduce a hierarchical filtering strategy that progressively culls candidate pairs through three GPU-optimized gates: normal compatibility, simplified overlap criterion, and LBVH-based distance queries, reducing the search space by 10–100× while maintaining cache coherence. For mid-point generation, we employ parallel distance dilation followed by bracket-and-bisect refinement for precise equidistant point localization. This method handles variable-thickness models with complex surfaces through complete dilation, thereby avoiding gaps and truncations that occur in previous methods. Experimental results on real-world benchmarks demonstrate that gMidSurf achieves 4.2×–18.5× speedups in face pairing and 4.8×–9.8× in mid-point generation compared to CPU implementations, yielding 5×–15× acceleration on a commodity GPU (NVIDIA RTX 5090D) compared to state-of-the-art methods while maintaining geometric accuracy.
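
The bracket-and-bisect refinement named in the abstract can be illustrated with a minimal serial sketch. It is not the GPU kernel from the paper: the two distance functions, the marching step, and the tolerance are hypothetical stand-ins chosen so the example runs on its own. The idea shown is only the general one: march along a search direction until the signed gap between the two face distances changes sign, then halve the interval until the point is equidistant.

```python
import math

def dist_to_face_a(p):
    # Hypothetical face A: the plane z = 0.
    return abs(p[2])

def dist_to_face_b(p):
    # Hypothetical face B: the slanted plane z = 1 + 0.5 * x,
    # so the wall thickness varies with x.
    return abs(p[2] - 1.0 - 0.5 * p[0]) / math.sqrt(1.0 + 0.25)

def equidistant_point(origin, direction, step=0.1, max_steps=1000, tol=1e-12):
    """Find the point on the ray origin + t * direction that is equally
    far from face A and face B (assumed to exist within max_steps * step)."""
    def point_at(t):
        return tuple(o + t * d for o, d in zip(origin, direction))

    def gap(t):
        p = point_at(t)
        return dist_to_face_a(p) - dist_to_face_b(p)

    # Bracket: advance a fixed-size window until the gap changes sign.
    lo, hi = 0.0, step
    for _ in range(max_steps):
        if gap(lo) * gap(hi) <= 0.0:
            break
        lo, hi = hi, hi + step
    else:
        raise ValueError("no sign change found along the ray")

    # Bisect: keep the half-interval that still contains the sign change.
    g_lo = gap(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = gap(mid)
        if g_lo * g_mid <= 0.0:
            hi = mid
        else:
            lo, g_lo = mid, g_mid
    return point_at(0.5 * (lo + hi))

# Start on face A at x = 0 and search along +z.
mid_point = equidistant_point((0.0, 0.0, 0.0), (0.0, 0.0, 1.0))
```

A GPU version would presumably run one such search per sample point with a fixed iteration count so that all threads in a warp follow the same path; that mapping is an assumption and is not shown here.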

Overview: The input thin-walled model is first discretized into triangular meshes on the GPU. Then (a) hierarchical GPU-based face pairing applies three progressive filtering gates (normal, overlap, and distance) to prune candidate face pairs by 10–100× and identify the face group pairs (FGPs), abstracting 1-1 face pairs into n-n FGPs through parallel bipartite graph optimization. Next, (b) GPU-based mid-point generation produces initial mid-points via parallel distance dilation and refines them to precise equidistant points through a bracket-and-bisect kernel. Finally, the model undergoes surface fitting and trimming operations to determine the boundaries of the mid-surface, yielding the final output mid-surface.
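
As a rough illustration of the three progressive gates (normal, overlap, and distance), the following serial sketch filters face pairs from cheapest test to most expensive. The `Face` record, the thresholds, and the box-based tests are assumptions made for this example; in particular, the box-to-box distance is a simple stand-in for the LBVH-based distance query, and the exact overlap criterion used by gMidSurf is not spelled out on this page.

```python
import math
from dataclasses import dataclass
from itertools import combinations

@dataclass
class Face:
    name: str
    normal: tuple    # unit outward normal (assumed representative of the face)
    box_min: tuple   # axis-aligned bounding box, lower corner
    box_max: tuple   # axis-aligned bounding box, upper corner

def normal_gate(a, b, cos_limit=-0.7):
    # Opposite walls of a thin part have roughly anti-parallel outward normals.
    dot = sum(x * y for x, y in zip(a.normal, b.normal))
    return dot <= cos_limit

def overlap_gate(a, b, max_thickness):
    # Boxes inflated by the thickness bound must intersect on every axis.
    return all(a.box_min[i] - max_thickness <= b.box_max[i] and
               b.box_min[i] - max_thickness <= a.box_max[i]
               for i in range(3))

def distance_gate(a, b, max_thickness):
    # Euclidean box-to-box distance (stand-in for an LBVH closest-point query).
    sq = 0.0
    for i in range(3):
        gap = max(a.box_min[i] - b.box_max[i], b.box_min[i] - a.box_max[i], 0.0)
        sq += gap * gap
    return math.sqrt(sq) <= max_thickness

def candidate_pairs(faces, max_thickness):
    # Short-circuit evaluation means later gates only see earlier survivors.
    return [(a.name, b.name)
            for a, b in combinations(faces, 2)
            if normal_gate(a, b)
            and overlap_gate(a, b, max_thickness)
            and distance_gate(a, b, max_thickness)]

demo_faces = [
    Face("top",    (0, 0, 1),  (0, 0, 1),   (10, 10, 1)),
    Face("bottom", (0, 0, -1), (0, 0, 0),   (10, 10, 0)),
    Face("side",   (1, 0, 0),  (10, 0, 0),  (10, 10, 1)),
    Face("far",    (0, 0, -1), (0, 0, -50), (10, 10, -50)),
]
pairs = candidate_pairs(demo_faces, max_thickness=2.0)
```

In this toy plate, the side face is rejected by the normal gate and the distant face by the overlap gate, so only the top/bottom pair reaches the distance test.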


Pipeline: The gMidSurf pipeline keeps geometry resident on the GPU through both core stages. Two computational bottlenecks (hierarchical GPU-based face pairing and dilation-based precise mid-point generation) are recast as massively parallel operations, while surface fitting and trimming run on the host. This end-to-end design preserves B-Rep modeling intent, supports both 1-1 and n-n face pairs, and handles constant- and variable-thickness regions robustly.
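
To make the step from 1-1 face pairs to n-n face group pairs concrete, one simple way to form groups is to treat the accepted pairs as edges of a bipartite graph and collect its connected components. The sketch below does this with a serial union-find; it is a stand-in for illustration only, since the parallel bipartite graph optimization used by gMidSurf is not detailed on this page. The function name and the face identifiers are hypothetical.

```python
def group_face_pairs(pairs):
    """Merge 1-1 face pairs (a, b) into n-n groups: faces connected through
    any chain of pairs end up in the same group."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Tag each face with its side so the two sides of the graph stay distinct.
    for a, b in pairs:
        root_a, root_b = find(("A", a)), find(("B", b))
        if root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for a, b in pairs:
        side_a, side_b = groups.setdefault(find(("A", a)), (set(), set()))
        side_a.add(a)
        side_b.add(b)
    return sorted((sorted(sa), sorted(sb)) for sa, sb in groups.values())

# f1 and f2 both face g1, and f2 also faces g2, so they form one 2-2 group.
fgps = group_face_pairs([("f1", "g1"), ("f2", "g1"), ("f2", "g2"), ("f3", "g3")])
```

Here `fgps` contains one 2-2 group and one 1-1 group, which mirrors the distinction between n-n and 1-1 face pairs made above.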

Technical Contributions

  • Full GPU-based face pairing with hierarchical culling. Normal, overlap, and distance tests are cast as GPU-friendly gates backed by LBVH traversal, pruning candidates by 10×–100× before fine checks and yielding near-linear scaling with face count.
  • Branchless mid-point solver for variable-thickness models. A bracket-and-bisect kernel equalizes distances robustly across planes, quadrics, and free-form NURBS while minimizing warp divergence and maximizing throughput.
  • End-to-end GPU design that preserves B-Rep intent. Geometry stays resident on device through face pairing and mid-point generation, supports 1-1 and n-n face pairs, and interoperates with trimming on the host to balance robustness and speed.
  • Extensive evaluation on industrial-style models. Experiments show stage-wise and end-to-end speedups over a tuned multi-core CPU baseline and previous methods, with more accurate results on variable-thickness parts and complex free-form models.
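
For readers unfamiliar with LBVH, construction conventionally starts by sorting primitives along a Morton (Z-order) curve so that spatially close triangles end up adjacent in memory. The sketch below shows that ordering step only. The 30-bit code layout, the unit-cube normalization, and the function names are assumptions for illustration; the page does not state which variant gMidSurf uses.

```python
def expand_bits(v):
    # Spread the low 10 bits of v so that bit i moves to bit 3 * i.
    out = 0
    for i in range(10):
        out |= ((v >> i) & 1) << (3 * i)
    return out

def morton3d(x, y, z):
    """30-bit Morton code for a point with coordinates in [0, 1]."""
    def quantize(c):
        return min(max(int(c * 1024.0), 0), 1023)
    return ((expand_bits(quantize(x)) << 2)
            | (expand_bits(quantize(y)) << 1)
            | expand_bits(quantize(z)))

def morton_order(centroids):
    # Indices of the centroids sorted along the Z-order curve.
    return sorted(range(len(centroids)), key=lambda i: morton3d(*centroids[i]))

# Hypothetical triangle centroids, already normalized to the unit cube.
order = morton_order([(0.9, 0.9, 0.9), (0.1, 0.1, 0.1), (0.1, 0.1, 0.2)])
```

After sorting, the two nearby centroids are neighbors in the ordering, which is the property that lets a tree built over consecutive ranges keep bounding volumes tight.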

Key Results

  • Hierarchical filtering gates reduce the candidate face-pair search space by 10×–100×.
  • 4.2×–18.5× speedup for face pairing and 4.8×–9.8× for mid-point generation versus CPU implementations.
  • 5×–15× end-to-end acceleration on a single NVIDIA RTX 5090D GPU compared with state-of-the-art methods (MidSurfer, Parasolid, Zhu et al.).
  • Complete dilation eliminates the gaps and truncations of prior methods while keeping the maximum positional deviation within 10⁻⁵ (FP32 vs. FP64).
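
One way such an FP32-versus-FP64 comparison might be measured is sketched below. It emulates single precision by rounding through the `struct` module and reports the largest Euclidean distance between corresponding points. The sample coordinates and the simple midpoint formula are hypothetical and are not taken from the paper's benchmarks; the exact metric and units behind the figure above are not specified on this page.

```python
import math
import struct

def to_fp32(x):
    # Round a Python float (FP64) to the nearest FP32 value.
    return struct.unpack("f", struct.pack("f", x))[0]

def midpoint_fp64(p, q):
    return tuple(0.5 * (a + b) for a, b in zip(p, q))

def midpoint_fp32(p, q):
    # Inputs and the sum are rounded to FP32; halving is exact in binary.
    return tuple(0.5 * to_fp32(to_fp32(a) + to_fp32(b)) for a, b in zip(p, q))

def max_positional_deviation(points_a, points_b):
    # Largest Euclidean distance between corresponding points.
    return max(math.dist(p, q) for p, q in zip(points_a, points_b))

# Hypothetical point pairs on opposite walls, in unit-scale coordinates.
sample_pairs = [
    ((0.1, 0.2, 0.3), (0.1, 0.2, 0.35)),
    ((1.7, 0.4, 0.9), (1.7, 0.45, 0.9)),
]
reference = [midpoint_fp64(p, q) for p, q in sample_pairs]
single = [midpoint_fp32(p, q) for p, q in sample_pairs]
deviation = max_positional_deviation(reference, single)
```

For unit-scale coordinates like these, the deviation stays at the level of FP32 rounding error; larger model extents would scale it accordingly.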


Benchmark: Eight iconic models (M1–M8) selected from the GrabCAD library cover a diverse range of geometric and topological configurations, including constant- and variable-thickness FGPs (C/V-FGPs), high curvature, n-n pairings, and complex free-form surfaces.

  • Model 1: An instrument microphone clip, consisting of 30 faces with no constant-thickness FGPs (all 11 FGPs are variable), used to validate the algorithm’s basic capabilities.
  • Model 2: A motorcycle winglet with 61 faces, including 15 variable-thickness FGPs (V-FGPs) and 3 constant-thickness FGPs (C-FGPs). This model presents a scenario where both C-FGPs and V-FGPs exist.
  • Model 3: A pipe bracket with 74 faces, where most V-FGPs exhibit significant curvature variations (>15° angular variation between paired faces), used to evaluate geometric correctness under challenging dilation conditions.
  • Model 4: A car wheel rim from the automotive industry, featuring 121 faces, including 20 V-FGPs and 19 n-n face pairs. Its multi-ring topology, local blends, and diverse variable-thickness scenarios pose significant challenges for face pairing.
  • Model 5: A table flower pot with 195 faces, containing the most V-FGPs (24) among all benchmarks. It demonstrates the algorithm’s capability to handle models with large curvature variations.
  • Model 6: A multi-port distributor from the mechanical design domain, with 215 faces and 52 C-FGPs (the highest C-FGP count among all benchmarks), demonstrating scenarios where constant-thickness structures predominate.
  • Model 7: A rear spoiler with 236 faces, with all FGPs exhibiting highly similar geometric primitives and gentle curvature variations, testing the filtering gates under low-discrimination conditions.
  • Model 8: An aircraft model from a real-world aerospace scenario with 286 faces and highly complex topological and geometric structures, used to verify algorithm robustness under the most challenging configurations.

Contents

Paper (PDF 5.5 MB)

Supplementary Video (MP4 99.7 MB)

Li Ye, Xinhang Zhou, Xingyu Yang, Peng Fan, Ruofeng Tong, Hailong Li, Peng Du and Min Tang. 2026. gMidSurf: Hierarchical GPU-based Mid-surface Abstraction for Thin-walled CAD Models. Computer-Aided Design (To appear).

   @article{ye26gmidsurf,
      author = {Ye, Li and Zhou, Xinhang and Yang, Xingyu and Fan, Peng and Tong, Ruofeng and Li, Hailong and Du, Peng and Tang, Min},
      title = {gMidSurf: Hierarchical GPU-based Mid-surface Abstraction for Thin-walled CAD Models},
      journal = {Computer-Aided Design},
      year = {2026},
      publisher = {Elsevier}
   }

Related Links

MidSurfer: Efficient Mid-surface Abstraction from Variable Thin-walled Models

gDist: Efficient Distance Computation between 3D Meshes on GPU

CTSN: Predicting Cloth Deformation for Skeleton-based Characters with a Two-stream Skinning Network

D-Cloth: Skinning-based Cloth Dynamic Prediction with a Three-stage Network

N-Cloth: Predicting 3D Cloth Deformation with Mesh-Based Networks

P-Cloth: Interactive Cloth Simulation on Multi-GPU Systems using Dynamic Matrix Assembly and Pipelined Implicit Integrators

I-Cloth: Incremental Collision Handling for GPU-Based Interactive Cloth Simulation

PSCC: Parallel Self-Collision Culling with Spatial Hashing on GPUs

I-Cloth: API for fast and reliable cloth simulation with CUDA

Efficient BVH-based Collision Detection Scheme with Ordering and Restructuring

MCCD: Multi-Core Collision Detection between Deformable Models using Front-Based Decomposition

Interactive Continuous Collision Detection between Deformable Models using Connectivity-Based Culling

TightCCD: Efficient and Robust Continuous Collision Detection using Tight Error Bounds

Fast and Exact Continuous Collision Detection with Bernstein Sign Classification

A GPU-based Streaming Algorithm for High-Resolution Cloth Simulation

Continuous Penalty Forces

UNC dynamic model benchmark repository

Collision-Streams: Fast GPU-based Collision Detection for Deformable Models

Fast Continuous Collision Detection using Deforming Non-Penetration Filters

Fast Collision Detection for Deformable Models using Representative-Triangles

DeformCD: Collision Detection between Deforming Objects

Self-CCD: Continuous Collision Detection for Deforming Objects

Interactive Collision Detection between Deformable Models using Chromatic Decomposition

Fast Proximity Computation Among Deformable Models using Discrete Voronoi Diagrams

CULLIDE: Interactive Collision Detection between Complex Models using Graphics Hardware

RCULLIDE: Fast and Reliable Collision Culling using Graphics Processors

Quick-CULLIDE: Efficient Inter- and Intra-Object Collision Culling using Graphics Hardware

Collision Detection

UNC GAMMA Group

Acknowledgements

This work was funded in part by the "Pioneer" and "Leading Goose" R&D Program of Zhejiang Province (No. 2025C01086).


tang_m@zju.edu.cn