c++11 - How to efficiently delete elements from vector c++ -


i have vector of pair of vectors(v1,v2) called pairv1v2 of following form:

(1,2,3),(938,462,4837) -> (v1,v2) (3,9,13),(938,0472,944) (81,84,93),(938,84,845) 

then need retain following:

(1,2,3),(938,462,4837) -> (v1,v2) (3,9,13),(938,0472,944) (81,84,93),(84,845) 

i need start scanning pairv1v2 beginning , ever 2 v1's not equal, there need delete intersecting elements v2. wrote following code doing same. however, code turns out inefficient vector pairv1v2 big , has many elements in v2 (about billion).

int main(int argc, char** argv) {     std::vector<std::pair<std::vector<unsigned>, std::vector<unsigned> > > pairv1v2;     std::vector<std::pair <std::vector<unsigned>,std::vector<unsigned> > >::iterator itm2,lm2=pairv1v2.end();     for(std::vector<std::pair <std::vector<unsigned>,std::vector<unsigned> > >::iterator itm=pairv1v2.begin(), lm=pairv1v2.end(); itm!=lm; ++itm)     {         //outer values         vector<unsigned> outerv1=(*itm).first;         vector<unsigned> outerv2=(*itm).second;         sort(outerv2.begin(), outerv2.end());         itm2=itm;         itm2++;         for(itm2;itm2!=lm2;++itm2)         {             vector<unsigned> innerv1=(*itm2).first;             vector<unsigned> innerv2=(*itm2).second;             vector<unsigned> setdiffv1;             std::set_difference(innerv1.begin(), innerv1.end(), outerv1.begin(), outerv1.end(),                                                       std::inserter(setdiffv1, setdiffv1.end()));                         if(setdiffv1.size()==0) //check whether 2 v1's different             {                                  sort(innerv2.begin(), innerv2.end());                 if((itm->second.size()!=0)&&(itm2->second.size()!=0)){                                                     std::vector<unsigned> delintersectingelem;                     std::set_intersection(outerv2.begin(),outerv2.end(),innerv2.begin(), innerv2.end(),                               std::back_inserter(delintersectingelem));                     if(delintersectingelem.size()!=0) //if there intersecting v2's                    {                                             for(std::vector<unsigned>::iterator its=(itm2->second).begin(),ls=(itm2->second).end();its!=ls;)                         {                              //if *its present in delintersectingelem delete it.                             if(!(std::find(delintersectingelem.begin(), delintersectingelem.end(), (*its)) == delintersectingelem.end()))                             {                                 (itm2->second).erase(its); //delete intersecting elements inner v2                                 ls--;                             }else{                                 ++its;                             }                         }                                         }                 }             }          }     }         return 0; } 

can please me in improving present code -- gives correct answer (in example might have missed few case brevity-- code handles of them) extremely slow (as diagonalised perf). i'll grateful if improvements suggestion in present piece of code. however, if logic of 2 codes same, new algorithm acceptable

the efficient way remove element vector back-swap trick, applies if don't care order.

#include <vector> #include <iostream>  int main() {     std::vector<int> v { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };     auto = v.begin() + 5;     // replace current element of vector,     // shrink size of vector 1.     *it = std::move(v.back());     v.pop_back();      (auto n : v) {         std::cout << n << " ";     }     std::cout << "\n"; } 

http://ideone.com/0jbwhz

if know there going many deletes or large vector, may retain efficiency using trick, remembering not ++ current iterator after doing delete, , std::sort()ing vector when reach end.

--- edit ---

#include <algorithm> #include <iostream> #include <vector>  //! efficiently remove element vector without //! preserving order. if element not last element //! in vector, transfer last element position //! using move if possible. //! regardless, shrink size of vector deleting //! element @ end, either destructed or //! element deleting. //! @note: invalidates current iterator. template<class valuetype> bool unstable_remove(     typename std::vector<valuetype>& container,     typename std::vector<valuetype>::iterator     ) {     // leave in-situ if tail element.     auto lastel = container.end() - 1;     if (it != lastel) {         // overwrite element in last,         // should have same effect deleting this.         *it = std::move(*lastel);     }     // release last cell of vector, because should     // either destructed or contain value     // deleting.     container.pop_back(); }  int main() {     std::vector<int> ints { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };     auto = ints.begin();     while (it != ints.end()) {         if ((*it % 3) == 0) {             unstable_remove(ints, it);             // not pass go / ++it             continue;         }         ++it;     }     std::cout << "after removes:\n";     (auto val : ints)         std::cout << val << " ";     std::cout << "\n";     std::sort(ints.begin(), ints.end());     std::cout << "after sort:\n";     (auto val : ints)         std::cout << val << " ";     std::cout << "\n"; } 

produces (http://ideone.com/hgzpoc)

after removes: 1 2 10 4 5 8  after sort: 1 2 4 5 8 10  

--- edit 2 ---

here's cleanup of code readability, ditched end captures because ... deleting elements.

#include <vector> #include <cstdint>  using vec_t = std::vector<uint32_t>; using vecpair_t = std::pair<vec_t, vec_t>; using pairvec_t = std::vector<vecpair_t>;  int main(int argc, char** argv) {     pairvec_t pairv1v2;     for(auto itm = pairv1v2.begin(); itm != pairv1v2.end(); ++itm)     {         //outer values         auto& outerv1 = itm->first; // note '&' - reference not copy!         auto& outerv2 = itm->second;         sort(outerv2.begin(), outerv2.end());         for(auto itm2 = itm + 1; itm2 != pairv1v2.end(); ++itm2)         {             auto& innerv1 = itm2->first;             auto& innerv2 = itm2->second;             vec_t setdiffv1; 

as way optimize - since lists sorted - walk both lists @ same time comparing values.

template<typename valuetype> void dedupe_vectors(     typename std::vector<valuetype>& lhs,     typename std::vector<valuetype>& rhs     ) {     auto lit = lhs.begin();     auto rit = rhs.begin();     while (rit != rhs.end) {         while (lit != lhs.end() && *lit < *rit)             ++lit;         if (lit == lhs.end())             break;         if (*lit == *rit) {             v2.erase(rit);             continue;         }           ++rit;     } } 

i know - test lit vs lhs.end twice. take @ code compiler generates -o3 , see if doesn't detect itself. if so, can worry optimizing it.


Comments