Microservices have become an important design paradigm for large-scale distributed systems, offering flexible provisioning options. A fundamental challenge is that the solution space grows exponentially with the number of user requests, which hinders efficient provisioning and scheduling when balancing cost and latency under resource constraints in large-scale dynamic edge environments. To tackle this problem, we formulate a joint optimization model for microservice provisioning and routing that integrates cost efficiency and latency reduction while accounting for uncertainty in the origin locations of requests. To establish a unified framework that facilitates decision-making, we propose an integer linear programming (ILP) model that captures the dependencies between microservices in the service chain. Our Scalable optimization framework with Cost-efficiency and Latency reduction (SoCL) comprises three stages: an initial partitioning stage that guarantees latency bounds, a pre-provisioning stage that accounts for provisioning cost, and a multi-scale combination stage that balances cost and latency through parallel and serial local search. Extensive experiments across diverse scenarios based on a commonly used dataset demonstrate that SoCL significantly increases cost efficiency and decreases latency compared with established baselines, while reducing execution time by up to one order of magnitude compared with obtaining the optimal solution using an optimizer.